2.5 Pro sets new state-of-the-art (SOTA) results across benchmarks, including:
- 18.8% on Humanity's Last Exam
- 63.8% on SWE-Bench Verified (agentic coding)
- GPQA Diamond and AIME 2025 (STEM)
- Long-context and visual reasoning
Gemini 2.5 Pro Achieves New SOTA Across AI Benchmarks