AI Dynamics

Global AI News Aggregator

GPT-4.1 Nearly Matches Claude 3.7 on SWE Coding Benchmark

GPT-4.1 nearly surpasses Claude 3.7 in coding?! New evaluation published with our #1 agent on the SWE-coding benchmark! GPT-4.1 outperforms Gemini 2.5 Pro and comes close to the level of Claude 3.7 Sonnet! Even GPT-4.1 mini matches the performance of Claude 3.5 Sonnet.

→ View original post on X (@vision_ia)