AI Dynamics

Global AI News Aggregator


Three-month-old Anthropic model achieves SOTA on code maintainability

A roughly three-month-old model is still SOTA and was +8% vs 5.3-codex on maintainability. Total Anthropic victory.

Quoted post from Gabe Orlanski (@GOrlanski): "We found that agents generate progressively worse code with each iteration. Real developers do not. SlopCodeBench is the only eval that faithfully measures quality degradation on iterative, long-horizon coding tasks. arxiv.org/abs/2603.24755 scbench.ai 🧵"

Source: https://nitter.net/GOrlanski/status/2037560777356238881#m

→ View original post on X — @alexjc, 2026-03-29 00:25 UTC