AI Dynamics

Global AI News Aggregator

Grok 4 Heavy Dominates Humanity’s Last Exam Benchmark

Benchmark Dominance
Grok 4 Heavy scored 44–50% on “Humanity’s Last Exam,” nearly doubling the score of its single-agent sibling and outpacing models from Gemini and OpenAI. It also scored 100% on AIME. This is frontier AI territory.

→ View original post on X — @futurepedia_io
