AI Dynamics

Global AI News Aggregator

About

Opinion on DeepSeek R1, o1, and suspicious o3 benchmarks

My personal vibes based opinion on this gap – at DeepSeek R1 level I believe this was real – o1 and r1 were not that far apart. From o3 onwards, I think there's something fishy going on with the benchmarks. Open models are not bad and certainly getting better, but the utility

→ View original post on X — @petergostev