Gemini 2.5 Pro Achieves New SOTA Across AI Benchmarks

AI Dynamics

Global AI News Aggregator


25 March 2025, 18:09

2.5 Pro sets new SOTA capabilities across benchmarks, including:
- 18.8% on Humanity's Last Exam
- 63.8% on SWE-Bench Verified (agentic coding)
- GPQA Diamond and AIME 2025 (STEM)
- Long-context and visual reasoning

→ View original post on X: @rowancheung, 25 March 2025
