We are hosting the first-ever Agent Skills Workshop at CAIS 2026. Submit your cool papers and demos. If you don't know what CAIS is, you are missing out. It's gonna be one of the highest-signal conferences in the Bay Area this year. What's more, @swyx's @aiDotEngineer World's Fair is partnering with it. Its committee: @gneubig @ChenLingjiao @JeffDean @lateinteraction @MonicaSLam @lmthang @pirroh @ChrisGPotts @NaveenGRao @dawnsongtweets and @istoica05
@lmthang
-
ARC-AGI-3: New Benchmark Shows AI Lacks True Learning Ability
Announcing ARC-AGI-3: the only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.
-
OpenProver v1.0.0: Open-Source Automated Theorem Prover Released
I'm releasing OpenProver v1.0.0! It's 1) an open-source automated theorem prover inspired by DeepMind's Aletheia (@tonylfeng @gjb_ai @lmthang), and 2) a "Claude Code for mathematicians", allowing interactive proof search in English and formalization in Lean. pic.twitter.com/xMTHWfIL31
— Matěj Kripner (@MatejKripner), March 23, 2026
-
AI-Empowered Mathematicians Achieve H2 Level Research Work
And shoutout to this independent work by @JulianSlzr and team on a Level 2 work, in the "Primarily Human" category, with the help of AlphaEvolve and DeepThink! nitter.net/JulianSlzr/status/2034…

Julian Salazar (@JulianSlzr): We're AI researchers @GoogleDeepMind who last did math full-time over 9 years ago. Despite our rustiness and limited time, AI empowered us to do some niche theory-building (@littmath). Per Aletheia's taxonomy, our work is H2 (primarily human)… for now! nitter.net/lmthang/status/2021644…

— https://nitter.net/JulianSlzr/status/2034947452005228627#m
-

AI Accelerates Mathematical and Scientific Discovery with Gemini DeepThink
And before that, 6 other math research papers by #Aletheia and other discoveries in physics and computer science, all of which were powered by Gemini #DeepThink! nitter.net/lmthang/status/2021631…

Thang Luong (@lmthang): 6 months in, after the IMO-gold achievement, I’m very excited to share another important milestone: AI can help accelerate knowledge discovery in mathematics, physics, and computer science!

We’re sharing two new papers from @GoogleDeepMind and @GoogleResearch that explore how Gemini #DeepThink together with agentic workflows can empower mathematicians and scientists to tackle professional research problems. Some highlights:

The first paper built a research agent, #Aletheia, powered by an advanced version of Gemini Deep Think, that can autonomously produce publishable math research and crack open Erdős problems.

The second paper, built on similar agentic reasoning ideas, helped resolve bottlenecks in 18 research problems across algorithms, ML and combinatorial optimization, information theory, and economics.

See the thread for details about the two papers and the joint blog post.

— https://nitter.net/lmthang/status/2021631397614731563#m
-

Aletheia solves 7th FirstProof problem at publishable research level
Tackling FirstProof was our 7th math research paper, which was done autonomously, and the solution to problem #7 is at Level 2 "Publishable Research" as well. nitter.net/lmthang/status/2026689…

Thang Luong (@lmthang): Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently; see the paper and full thoughts in the thread. 👇

— https://nitter.net/lmthang/status/2026689272456294850#m
-

Mathematician and AI Collaborate on Mathematical Proof Construction
The mathematician, Anand, had the intuition (using two involutions) but couldn't quite assemble the proof. Aletheia built the exact construction required to bring it all together! 🧠 For full transparency, the Human-AI Interaction (HAI) card and the transcript are included! Paper: arxiv.org/abs/2603.19052 Transcript: github.com/google-deepmind/s…
-

Aletheia AI Powers 8 Math Research Papers, Solves Hodge Bundle Problem

Update: #Aletheia has now powered 8 math research papers! 📈 Our most recent success, “The Simplicity of the Hodge Bundle,” was solved fully autonomously. It’s at Level 2 “Publishable research” per our categorization. More in thread!

Tony Feng (@tonylfeng): A few months ago I bumped into Anand Patel, who had been my algebraic geometry TA in college, visiting Google DeepMind. He agreed to try out an agent I was building called Aletheia. Fast forward: Anand prompted Aletheia to solve a problem about simplicity of the Hodge bundle on M_g that had been floating around (a part of) the algebraic geometry community for at least ten years. Check out his paper at arxiv.org/pdf/2603.19052

— https://nitter.net/tonylfeng/status/2035003908993819019#m
-
Aletheia's Autonomous Math Research Begins a New Era of AI Discovery
This is just the beginning for #Aletheia and autonomous math research. We’re excited to keep pushing the boundaries in AI for knowledge discovery responsibly and transparently! Thanks to the #FirstProof team for a brilliant challenge! 🚀 Learn more about Aletheia here: nitter.net/lmthang/status/2026689…
-
OpenAI and Google DeepMind Compete on Mathematics
Key Excerpt: "OpenAI claims it answered half… while Google DeepMind scored 6/10… Google's #Aletheia uses a computationally intensive Gemini paired with a verification algorithm… mathematicians are still impressed." Article (paywalled): newscientist.com/article/251…
