Parallax: A Parameterized Local Linear Attention That Keeps Softmax and Adds a Learned Covariance Correction Branch

The Transformer’s attention mechanism has barely changed since 2017. Most efficiency work has tried to replace softmax attention outright. A new paper takes a different route: it keeps softmax attention and bolts on a correction branch. A team of researchers from Northwestern University, Tilde Research, and the University of Washington introduces a parameterized Local Linear Attention…

Microsoft Surface Laptop Ultra announced, featuring 15-inch mini-LED display, NVIDIA Blackwell RTX, 128GB unified memory, and more

Microsoft has announced the Surface Laptop Ultra, its most powerful Surface device to date, aimed at developers, AI researchers, software engineers, and creative professionals. Developed in collaboration with NVIDIA, the new laptop is designed to handle demanding workloads, including large-scale AI models, software development, 3D rendering, and content creation, while maintaining the portability of a…

A Developer’s Guide to Systematic Prompting: Mastering Negative Constraints, Structured JSON Outputs, and Multi-Hypothesis Verbalized Sampling

Most developers treat prompting as an afterthought: write something reasonable, observe the output, and iterate if needed. That approach works until reliability becomes critical. As LLMs move into production systems, the difference between a prompt that usually works and one that works consistently becomes an engineering concern. In response, the research community has formalized prompting into…

RightNow AI Releases AutoKernel: An Open-Source Framework that Applies an Autonomous Agent Loop to GPU Kernel Optimization for Arbitrary PyTorch Models

Writing fast GPU code is one of the most grueling specializations in machine learning engineering. Researchers from RightNow AI want to automate it entirely. The RightNow AI research team has released AutoKernel, an open-source framework that applies an autonomous LLM agent loop to GPU kernel optimization for arbitrary PyTorch models. The approach is straightforward: give…

Elon Musk pauses changes to X’s creator revenue-sharing program after backlash

Social media platform X swiftly backtracked on its announcement regarding new rules for creator monetization, which had focused on payouts based on engagement from a creator’s local audience. Late Tuesday, X Head of Product Nikita Bier announced that, starting Thursday, the platform will change its policy around payouts and will give more emphasis to impressions…
