Building a Context-Folding LLM Agent for Long-Horizon Reasoning with Memory Compression and Tool Use

In this tutorial, we explore how to build a Context-Folding LLM Agent that efficiently solves long, complex tasks by intelligently managing limited context. We design the agent to break down a large task into smaller subtasks, perform reasoning or calculations when needed, and then fold each completed sub-trajectory into concise summaries. By doing this, we…

Read More

QeRL: NVFP4-Quantized Reinforcement Learning (RL) Brings 32B LLM Training to a Single H100—While Improving Exploration

What would you build if you could run Reinforcement Learning (RL) post-training on a 32B LLM in 4-bit NVFP4—on a single H100—with BF16-level accuracy and 1.2–1.5× step speedups? NVIDIA researchers (with collaborators from MIT, HKU, and Tsinghua) have open-sourced QeRL (Quantization-enhanced Reinforcement Learning), a training framework that pushes Reinforcement Learning (RL) post-training into 4-bit FP4…

Read More

Ivy Framework Agnostic Machine Learning Build, Transpile, and Benchmark Across All Major Backends

In this tutorial, we explore Ivy’s remarkable ability to unify machine learning development across frameworks. We begin by writing a fully framework-agnostic neural network that runs seamlessly on NumPy, PyTorch, TensorFlow, and JAX. We then dive into code transpilation, unified APIs, and advanced features like Ivy Containers and graph tracing, all designed to make deep…

Read More