New AI Research from Anthropic and Thinking Machines Lab Stress Tests Model Specs and Reveals Character Differences among Language Models
AI companies use model specifications to define target behaviors during training and evaluation. Do current specs state the intended behaviors with enough precision, and do frontier models exhibit distinct behavioral profiles under the same spec? A team of researchers from Anthropic, Thinking Machines Lab and Constellation presents a systematic method that stress tests model specs…
