How to Build a Model-Native Agent That Learns Internal Planning, Memory, and Multi-Tool Reasoning Through End-to-End Reinforcement Learning

In this tutorial, we explore how an agent can internalize planning, memory, and tool use within a single neural model rather than relying on external orchestration. We design a compact, model-native agent that learns to perform arithmetic reasoning tasks through reinforcement learning. By combining a stage-aware actor-critic network with a curriculum of increasingly complex environments,…

Replika founder raises $20M pre-seed for Wabi, the ‘YouTube of apps’ 

Eugenia Kuyda saw the future of consumer AI before most. She founded Replika, the first major AI companion startup, in 2017, years before ChatGPT launched. Today it has 35 million users. Now Kuyda is back with a new startup called Wabi, which she describes as YouTube for apps: a social platform where anyone can use prompts to instantly create mini apps and share…

Google AI Introduces Consistency Training for Safer Language Models Under Sycophantic and Jailbreak Style Prompts

How can consistency training help language models resist sycophantic prompts and jailbreak-style attacks while keeping their capabilities intact? Large language models often answer safely on a plain prompt, then change behavior when the same task is wrapped in flattery or role play. DeepMind researchers propose consistency training as a simple training lens for this…
