Thrilled to announce the @Kaggle Game Arena, a new leaderboard testing how modern LLMs perform on games (spoiler: not very well atm!). AI systems play each other, making it an objective & evergreen benchmark that will scale in difficulty as they improve.
@demishassabis
-
Games as AI Testing Grounds: Arena Benchmarks Drive Progress
Games have always been a useful proving ground for AI (including our own work on AlphaGo & AlphaZero) and we're excited to see the progress this benchmark will drive as we add more games and challenges to the Arena – we expect to see rapid improvement!
-
AGI Societal Impact: Major Changes Ahead Discussion
Recently had a great conversation with @StevenLevy @WIRED about the societal implications of AGI. A lot of things are about to change dramatically:
-
Quadrillion tokens processed: AI infrastructure milestone
You know what's cool… a quadrillion tokens. We processed almost 1,000,000,000,000,000 tokens last month, more than double the amount from May.
-
Wide-ranging conversation on AGI, AI, gaming, and scientific advancement
Thanks @lexfridman for another super fun & wide-ranging conversation. We talked about the future of video games, the nature of reality, advancing science with AI, the path to AGI… and quite a bit more as usual! Always a blast, already looking forward to next time! 😀 https://t.co/NiXcrPhWdb
Demis Hassabis (@demishassabis), July 24, 2025
-
Aeneas AI Model Unlocks Insights in Ancient Inscriptions Research
Our Aeneas AI model gives historians valuable new insights into ancient inscriptions & ancient history that may have taken years to uncover otherwise. Published in @Nature today:
-
Free AI Tool Helps Teachers Teach History Through Technology
We’ve made it available for free at http://predictingthepast.com
And we’ve co-designed a new syllabus to help teachers share the incredible potential of this technology to accelerate & expand our understanding of history:
-
AI System Receives First Official Gold-Level Performance Grading at IMO
We've now been given permission to share our results and are pleased to have been part of the inaugural cohort to have our model results officially graded and certified by IMO coordinators and experts, receiving the first official gold-level performance grading for an AI system!
-
Gemini Deep Think Achieves Advanced Reasoning Capabilities
We achieved this year’s impressive result using an advanced version of Gemini Deep Think (an enhanced reasoning mode for complex problems). Our model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions –
-
AI Lab Respects IMO Board Request on Results Announcement
As an aside, we didn’t announce on Friday because we respected the IMO Board's original request that all AI labs share their results only after the official results had been verified by independent experts & the students had rightly received the acclamation they deserved.