Looking forward to speaking at this tomorrow! And maybe I'll reveal a bit about what the 10th paper that #Aletheia has helped mathematicians (autonomously) tackle is 🙂
@lmthang
-
AI Math Capabilities: Conservative Assessment at L2+ Level
Maybe we should calibrate together. We (generally @tonylfeng) are more on the conservative side: results in AI for Maths are mostly at L2+ (for now). We haven't seen any L3 result yet.
-

Shifting AI evaluation metrics beyond Level 2 results
Yes, we will soon stop counting Level-2 results & shift toward Levels 3 & 4 that were laid out in the Aletheia paper.
-

AI Helps Mathematicians: 9th Research Paper Success
The 9th math research paper that #Aletheia has helped mathematicians with! Number 10 is coming soon! Quoting @tonylfeng on this:
"Cool new paper by Thomas Krämer, Daniel Litt, Marco Maculan, featuring a (small) contribution from Gemini Deep Think and Aletheia. … AI is very useful for -

First Agent Skills Workshop at CAIS 2026 Conference
We are hosting the first-ever Agent Skills Workshop at CAIS 2026. Submit your cool papers and demos. If you don't know what CAIS is, you are missing out. It's gonna be one of the most high-signal conferences in the Bay this year. What's more: @swyx's @aiDotEngineer world fair is partnering with it. Its committee: @gneubig @ChenLingjiao @JeffDean @lateinteraction @MonicaSLam @lmthang @pirroh @ChrisGPotts @NaveenGRao @dawnsongtweets and @istoica05
-
ARC-AGI-3: New Benchmark Shows AI Lacks True Learning Ability
Announcing ARC-AGI-3: the only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.
-
OpenProver v1.0.0: Open-Source Automated Theorem Prover Released
I'm releasing OpenProver v1.0.0! It's 1) an open-source automated theorem prover inspired by DeepMind's Aletheia (@tonylfeng @gjb_ai @lmthang), and 2) a "Claude Code for mathematicians", allowing interactive proof search in English and formalization in Lean. pic.twitter.com/xMTHWfIL31
— Matěj Kripner (@MatejKripner), March 23, 2026
-
AI-Empowered Mathematicians Achieve H2 Level Research Work
And shoutout to this independent work by @JulianSlzr and team on a Level 2 result, in the "Primarily Human" category, with the help of AlphaEvolve and DeepThink! nitter.net/JulianSlzr/status/2034…
Julian Salazar (@JulianSlzr): We're AI researchers @GoogleDeepMind who last did math full-time over 9 years ago. Despite our rustiness and limited time, AI empowered us to do some niche theory-building (@littmath). Per Aletheia's taxonomy, our work is H2 (primarily human)… for now! nitter.net/lmthang/status/2021644…
— https://nitter.net/JulianSlzr/status/2034947452005228627#m
-

AI Accelerates Mathematical and Scientific Discovery with Gemini DeepThink
And before that, 6 other math research papers by #Aletheia and other discoveries in physics and computer science, all of which were powered by Gemini #DeepThink! nitter.net/lmthang/status/2021631…
Thang Luong (@lmthang): 6 months in, after the IMO-gold achievement, I'm very excited to share another important milestone: AI can help accelerate knowledge discovery in mathematics, physics, and computer science! We're sharing two new papers from @GoogleDeepMind and @GoogleResearch that explore how Gemini #DeepThink together with agentic workflows can empower mathematicians and scientists to tackle professional research problems.
Some highlights: The first paper built a research agent, #Aletheia, powered by an advanced version of Gemini Deep Think, that can autonomously produce publishable math research and crack open Erdős problems. The second paper, built on similar agentic reasoning ideas, helped resolve bottlenecks in 18 research problems across algorithms, ML and combinatorial optimization, information theory, and economics. See the thread for details about the two papers and the joint blog post.
— https://nitter.net/lmthang/status/2021631397614731563#m
-

Aletheia solves 7th FirstProof problem at publishable research level
Tackling FirstProof was our 7th math research paper, which was done autonomously, and the solution to problem #7 is at Level 2 "Publishable Research" as well. nitter.net/lmthang/status/2026689…
Thang Luong (@lmthang): Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently; see the paper and full thoughts in the thread. 👇
— https://nitter.net/lmthang/status/2026689272456294850#m
