UltraCUA: A Foundation Computer-Use Agents Model that Bridges the Gap between General-Purpose GUI Agents and Specialized API-based Agents

Computer-use agents have been limited to primitives. They click, they type, they scroll. Long action chains amplify grounding errors and waste steps. Apple Researchers introduce UltraCUA, a foundation model that builds an hybrid action space that lets an agent interleave low level GUI actions with high level programmatic tool calls. The model chooses the cheaper…

Read More

A Coding Guide to Build a Fully Functional Multi-Agent Marketplace Using uAgent

In this tutorial, we explore how to build a small yet functional multi-agent system using the uAgents framework. We set up three agents — Directory, Seller, and Buyer — that communicate via well-defined message protocols to simulate a real-world marketplace interaction. We design message schemas, define agent behaviors, and implement request-response cycles to demonstrate discovery,…

Read More

Anthrogen Introduces Odyssey: A 102B Parameter Protein Language Model that Replaces Attention with Consensus and Trains with Discrete Diffusion

Anthrogen has introduced Odyssey, a family of protein language models for sequence and structure generation, protein editing, and conditional design. The production models range from 1.2B to 102B parameters. The Anthrogen’s research team positions Odyssey as a frontier, multimodal model for real protein design workloads, and notes that an API is in early access. https://www.biorxiv.org/content/10.1101/2025.10.15.682677v1.full.pdf…

Read More

PokeeResearch-7B: An Open 7B Deep-Research Agent Trained with Reinforcement Learning from AI Feedback (RLAIF) and a Robust Reasoning Scaffold

Pokee AI has open sourced PokeeResearch-7B, a 7B parameter deep research agent that executes full research loops, decomposes a query, issues search and read calls, verifies candidate answers, then synthesizes multiple research threads into a final response. The agent runs a research and verification loop. In research, it calls external tools for web search and…

Read More

How to Design a Fully Functional Enterprise AI Assistant with Retrieval Augmentation and Policy Guardrails Using Open Source AI Models

In this tutorial, we explore how we can build a compact yet powerful Enterprise AI assistant that runs effortlessly on Colab. We start by integrating retrieval-augmented generation (RAG) using FAISS for document retrieval and FLAN-T5 for text generation, both fully open-source and free. As we progress, we embed enterprise policies such as data redaction, access…

Read More

Google AI Introduces VISTA: A Test Time Self Improving Agent for Text to Video Generation

TLDR: VISTA is a multi agent framework that improves text to video generation during inference, it plans structured prompts as scenes, runs a pairwise tournament to select the best candidate, uses specialized judges across visual, audio, and context, then rewrites the prompt with a Deep Thinking Prompting Agent, the method shows consistent gains over strong…

Read More

How I Built an Intelligent Multi-Agent Systems with AutoGen, LangChain, and Hugging Face to Demonstrate Practical Agentic AI Workflows

In this tutorial, we dive into the essence of Agentic AI by uniting LangChain, AutoGen, and Hugging Face into a single, fully functional framework that runs without paid APIs. We begin by setting up a lightweight open-source pipeline and then progress through structured reasoning, multi-step workflows, and collaborative agent interactions. As we move from LangChain…

Read More

Google AI Research Releases DeepSomatic: A New AI Model that Identifies Cancer Cell Genetic Variants

A team of researchers from Google Research and UC Santa Cruz released DeepSomatic, an AI model that identifies cancer cell genetic variants. In research with Children’s Mercy, it found 10 variants in pediatric leukemia cells missed by other tools. DeepSomatic has a somatic small variant caller for cancer genomes that works across Illumina short reads,…

Read More

DeepSeek Just Released a 3B OCR Model: A 3B VLM Designed for High-Performance OCR and Structured Document Conversion

DeepSeek-AI released 3B DeepSeek-OCR, an end to end OCR and document parsing Vision-Language Model (VLM) system that compresses long text into a small set of vision tokens, then decodes those tokens with a language model. The method is simple, images carry compact representations of text, which reduces sequence length for the decoder. The research team…

Read More