AI Dynamics

Global AI News Aggregator

Claude Opus 4.1 Achieves Expert-Level Performance on GDPval

On GDPval, expert graders compared outputs from leading models to human expert work. Claude Opus 4.1 delivered the strongest results, with just under half of its outputs rated as good as or better than expert work. Just as striking is the pace of progress: OpenAI’s frontier

→ View original post on X — @openai