AI Dynamics

Global AI News Aggregator

@lmthang

Aletheia AI Agent Tackles 10th Mathematics Paper Autonomously

By

@lmthang

–

01 May 2026 18:05

Looking forward to speaking at this tomorrow! And maybe reveal a bit what the 10th paper that #Aletheia has helped mathematicians (autonomously) tackle is 🙂

→ View original post on X — @lmthang

1 May 2026
AI Math Capabilities: Conservative Assessment at L2+ Level

By

@lmthang

–

29 April 2026 18:06

Maybe we should calibrate together. We (generally @tonylfeng) are more on the conservative side that results in AI for Maths are mostly at L2+ (for now). We haven't seen any L3 result yet.

→ View original post on X — @lmthang

29 April 2026
Shifting AI evaluation metrics beyond Level 2 results

By

@lmthang

–

29 April 2026 8:53

Yes, we will soon stop counting Level-2 results & shift toward Levels 3 & 4 that were laid out in the Aletheia paper.

→ View original post on X — @lmthang

29 April 2026
AI Helps Mathematicians: 9th Research Paper Success

By

@lmthang

–

28 April 2026 19:03

9th math research paper that #Aletheia has helped mathematicians! Number 10 is coming soon! Quoting @tonylfeng on this:
"Cool new paper by Thomas Krämer, Daniel Litt, Marco Maculan, featuring a (small) contribution from Gemini Deep Think and Aletheia. … AI is very useful for

→ View original post on X — @lmthang

28 April 2026
First Agent Skills Workshop at CAIS 2026 Conference

By

@lmthang

–

03 April 2026 9:09

We are hosting the first ever Agent Skills Workshop at CAIS 2026. Submit your cool papers and demos. If you don't know what CAIS is. You are missing out. It's gonna be one of the most high signal conference in the bay this year. What's more: @swyx's @aiDotEngineer world fair is partnering with it. Its committee: @gneubig @ChenLingjiao @JeffDean @lateinteraction @MonicaSLam @lmthang @pirroh @ChrisGPotts @NaveenGRao @dawnsongtweets and @istoica05

→ View original post on X — @lmthang, 2026-04-03 07:09 UTC

3 April 2026
ARC-AGI-3: New Benchmark Shows AI Lacks True Learning Ability

By

@lmthang

–

25 March 2026 18:37

Announcing ARC-AGI-3

The only unsaturated agentic intelligence benchmark in the world

Humans score 100%, AI <1%

This human-AI gap demonstrates we do not yet have AGI

Most benchmarks test what models already know, ARC-AGI-3 tests how they learn

→ View original post on X — @lmthang, 2026-03-25 17:37 UTC

25 March 2026
OpenProver v1.0.0: Open-Source Automated Theorem Prover Released

By

@lmthang

–

23 March 2026 19:25

I'm releasing OpenProver v1.0.0!

It's 1) an open-source automated theorem prover inspired by DeepMind's Aletheia (@tonylfeng @gjb_ai @lmthang), and 2) a "Claude Code for mathematicians", allowing interactive proof search in English and formalization in Lean. pic.twitter.com/xMTHWfIL31
— Matěj Kripner (@MatejKripner) 23 March 2026

→ View original post on X — @lmthang, 2026-03-23 18:25 UTC

23 March 2026
AI-Empowered Mathematicians Achieve H2 Level Research Work

By

@lmthang

–

20 March 2026 19:49

And shoutout to this independent work by @JulianSlzr and team on a Level 2 work, in the "Primarily Human" category, with the help of AlphaEvolve and DeepThink! nitter.net/JulianSlzr/status/2034…

Julian Salazar (@JulianSlzr): We're AI researchers @GoogleDeepMind who last did math full-time over 9 years ago. Despite our rustiness and limited time, AI empowered us to do some niche theory-building (@littmath). Per Aletheia's taxonomy, our work is H2 (primarily human)… for now! nitter.net/lmthang/status/2021644…
— https://nitter.net/JulianSlzr/status/2034947452005228627#m

→ View original post on X — @lmthang, 2026-03-20 18:49 UTC

20 March 2026
AI Accelerates Mathematical and Scientific Discovery with Gemini DeepThink

By

@lmthang

–

20 March 2026 19:39

And before that 6 other math research papers by #Aletheia and other discoveries in physics and computer science, all of which were powered by Gemini #DeepThink! nitter.net/lmthang/status/2021631…

Thang Luong (@lmthang): 6 months in, after the IMO-gold achievement, I’m very excited to share another important milestone: AI can help accelerate knowledge discovery in mathematics, physics, and computer science!

We’re sharing Two new papers from @GoogleDeepMind and @GoogleResearch that explore how Gemini #DeepThink together with agentic workflows can empower mathematicians and scientists to tackle professional research problems.

Some highlights:

The first paper built a research agent #Aletheia, powered by an advanced version of Gemini Deep Think, that can autonomously produce publishable math research and crack open Erdős problems.

The second paper, built on similar agentic reasoning ideas, helped resolve bottlenecks in 18 research problems, across algorithms, ML and combinatorial optimization, information theory and economics.

See the thread for details about the two papers and the joint blog post.
— https://nitter.net/lmthang/status/2021631397614731563#m

→ View original post on X — @lmthang, 2026-03-20 18:39 UTC

20 March 2026
Aletheia solves 7th FirstProof problem at publishable research level

By

@lmthang

–

20 March 2026 19:28

Tackling FirstProof was our 7th math research paper, which was done autonomously and the solution to problem #7 is at Level 2 "Publishable Research" as well. nitter.net/lmthang/status/2026689…

Thang Luong (@lmthang): Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge!

To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians.

We share our results transparently, see paper and full thoughts in the thread. 👇
— https://nitter.net/lmthang/status/2026689272456294850#m

→ View original post on X — @lmthang, 2026-03-20 18:28 UTC

20 March 2026
