And here's the little leaderboard that we maintain on IMO ProofBench in case you haven't seen it.
* Our IMO-gold model (non-public, Jul 2025) got 65.7%.
* Gemini 3 Deep Think (public, Feb 2026) now got 76.7%.
* Aletheia (non-public) with inference-time scaling law +
AI Models Achieve 76.7% on IMO ProofBench Mathematics
