AI

A Coding Guide to Demonstrate Targeted Data Poisoning Attacks in Deep Learning by Label Flipping on CIFAR-10 with PyTorch

In this tutorial, we demonstrate a realistic data poisoning attack by manipulating labels in the CIFAR-10 dataset and observing its impact on model behavior. We...

OpenAI has Released the ‘circuit-sparsity’: A Set of Open Tools for Connecting Weight Sparse Models and Dense Baselines through Activation Bridges

OpenAI team has released their openai/circuit-sparsity model on Hugging Face and the openai/circuit_sparsity toolkit on GitHub. The release packages the models and circuits from the...

AI Interview Series #2: Explain Some of the Common Model Context Protocol (MCP) Security Vulnerabilities

In this part of the Interview Series, we’ll look at some of the common security vulnerabilities in the Model Context Protocol (MCP) — a framework...

Google DeepMind Introduces SIMA 2, A Gemini Powered Generalist Agent For Complex 3D Virtual Worlds

Google DeepMind has released SIMA 2 to test how far generalist embodied agents can go inside complex 3D game worlds. SIMA’s (Scalable Instructable Multiworld Agent)...

Cerebras Releases MiniMax-M2-REAP-162B-A10B: A Memory Efficient Version of MiniMax-M2 for Long Context Coding Agents

Cerebras has released MiniMax-M2-REAP-162B-A10B, a compressed Sparse Mixture-of-Experts (SMoE) Causal Language Model derived from MiniMax-M2, using the new Router weighted Expert Activation Pruning (REAP) method....

RL without TD learning
November 1, 2025
In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional […]
What exactly does word2vec learn?
September 1, 2025
What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting […]
Whole-Body Conditioned Egocentric Video Prediction
July 1, 2025
.modal { display: none; position: fixed; z-index: 9999; padding-top: 50px; left: 0; top: 0; width: 100%; height: 100%; overflow: auto; […]
Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)
April 11, 2025
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks […]
Repurposing Protein Folding Models for Generation with Latent Diffusion
April 8, 2025
PLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space […]
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment
March 25, 2025
Training Diffusion Models with Reinforcement Learning We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion […]
Virtual Personas for Language Models via an Anthology of Backstories
November 12, 2024
We introduce Anthology, a method for conditioning LLMs to representative, consistent, and diverse virtual personas by generating and utilizing naturalistic […]
Linguistic Bias in ChatGPT: Language Models Reinforce Dialect Discrimination
September 20, 2024
Sample language model responses to different varieties of English and native speaker reactions. ChatGPT does amazingly well at communicating with […]
How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark
August 28, 2024
When we began studying jailbreak evaluations, we found a fascinating paper claiming that you could jailbreak frontier LLMs simply by […]
Are We Ready for Multi-Image Reasoning? Launching VHs: The Visual Haystacks Benchmark!
July 20, 2024
Humans excel at processing vast arrays of visual information, a skill that is crucial for achieving artificial general intelligence (AGI). […]

Trending News

AI

Home

AI