AI Dynamics

Global AI News Aggregator

@lmthang

Math Agent Aletheia Featured in FirstProof Challenge by New Scientist

By @lmthang – 10 March 2026 23h31

“Mathematics is undergoing the biggest change in its history.” Glad to see our math agent #Aletheia featured by @newscientist on its results at the FirstProof inaugural challenge, alongside other interesting milestones in AI for Math research! Implications worth thinking about. Link in thread.

→ View original post on X — @lmthang, 2026-03-10 22:31 UTC

Nano Banana 2 debuts at number one in Image Arena

By @lmthang – 26 February 2026 19h57

And it's still only Feb 2026 …

Arena.ai (@arena): 🚨BREAKING: Nano Banana 2 debuts at #1 in Image Arena, and it changes the game again 🍌🍌

Officially released as Gemini 3.1 Flash Image Preview, it is powered by real-time information and images from web search.

Highlights:
– #1 Text-to-Image, scoring 1279, surpassing GPT-Image-1.5 and Nano Banana Pro
– Ties for #1 Single-Image Edit, scoring 1407, on par with ChatGPT-Image-Latest
– Top 3 Multi-Image Edit, alongside Nano Banana Pro variants
– $0.067 per image, ~2x cheaper than Nano Banana Pro

Congrats to the @GoogleDeepMind team for continuing to push the frontier!

Quoted post: https://nitter.net/arena/status/2027053222876393703#m

→ View original post on X — @lmthang, 2026-02-26 18:57 UTC

HAI Card for FirstProof: Transparent AI Knowledge Discovery Documentation

By @lmthang – 26 February 2026 1h10

Again, we encourage the AI-for-knowledge-discovery community to document their work transparently. Below is our Human-AI Interaction (HAI) card for FirstProof. This concept was introduced in our #Aletheia paper, arxiv.org/abs/2602.10177, and was inspired by ML model cards!

→ View original post on X — @lmthang, 2026-02-26 00:10 UTC

AI Breakthrough: Aletheia Solves Mathematical and Scientific Discovery Problems

By @lmthang – 25 February 2026 17h46

For more information about Aletheia, powered by Gemini #DeepThink, and our work in AI for mathematical and scientific discovery at @GoogleDeepMind and @GoogleResearch, check out our announcement last week: nitter.net/lmthang/status/2021631…

Thang Luong (@lmthang): 6 months in, after the IMO-gold achievement, I’m very excited to share another important milestone: AI can help accelerate knowledge discovery in mathematics, physics, and computer science!

We’re sharing two new papers from @GoogleDeepMind and @GoogleResearch that explore how Gemini #DeepThink, together with agentic workflows, can empower mathematicians and scientists to tackle professional research problems.

Some highlights:
– The first paper built a research agent, #Aletheia, powered by an advanced version of Gemini Deep Think, that can autonomously produce publishable math research and crack open Erdős problems.
– The second paper, built on similar agentic reasoning ideas, helped resolve bottlenecks in 18 research problems across algorithms, ML and combinatorial optimization, information theory, and economics.

See the thread for details about the two papers and the joint blog post.

Quoted post: https://nitter.net/lmthang/status/2021631397614731563#m

→ View original post on X — @lmthang, 2026-02-25 16:46 UTC

FirstProof Problem 7 Solution Confirmed by Original Mathematician

By @lmthang – 25 February 2026 17h45

The correctness of our solution to FirstProof problem 7 has also been confirmed by Jim Fowler, the mathematician who originally posed the question! See github.com/google-deepmind/s… for all our transcripts and solutions (both correct and incorrect ones!), as well as the public discussion of P7 at icarm.zulipchat.com/#narrow/….

→ View original post on X — @lmthang, 2026-02-25 16:45 UTC

Aletheia AI Solves Open Math Problem P7 Successfully

By @lmthang – 25 February 2026 17h34

This is a remarkable milestone in which our agent can work on a research problem for a very long time, then come back and tell us whether it has succeeded or failed! We visualize the inference cost Aletheia decided to spend on each candidate solution (as a multiple of the inference cost of solving Erdős-1051; see our previous work, nitter.net/lmthang/status/2018354…).

P7 is extremely interesting. It has been an open problem for several years, and nobody else came close to solving it in the FirstProof contest, per @tonylfeng. We initially thought Aletheia had no chance; it turned out it was right! Aletheia spent the most compute on P7, 16x the amount we used for Erdős-1051.

Remarkably, per @kimshmath, "This was the first case that I have ever seen that an AI applies several deep mathematical results (by Cartan/Leray/Borel/Atiyah/Quillen/Novikov/Kasparov…) flawlessly. It is a very unique instance."

→ View original post on X — @lmthang, 2026-02-25 16:34 UTC

Aletheia solves 6 of 10 FirstProof problems using Gemini DeepThink

By @lmthang – 25 February 2026 17h08

We ran two Aletheia versions (differing only by base model) powered by Gemini #DeepThink. Together, they solved 6/10 problems (2, 5, 7, 8, 9, 10) per majority expert assessments. Full transparency on our FirstProof interpretation and experiments: arxiv.org/abs/2602.21201. Evaluation is extremely hard! Only a handful of experts can even understand these problems. As such, we have conducted our study very carefully! Crucially, our solutions were generated without any human intervention and submitted within the timeframe of the FirstProof challenge. The lead author of FirstProof confirmed that fact in the public Zulip discussion of our solutions icarm.zulipchat.com/#narrow/….

→ View original post on X — @lmthang, 2026-02-25 16:08 UTC

Aletheia Math Agent Solves Hard FirstProof Problems Autonomously

By @lmthang – 25 February 2026 17h02

Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently; see the paper and full thoughts in the thread. 👇

→ View original post on X — @lmthang, 2026-02-25 16:02 UTC

Gemini 3.1 Pro Meme Video

By @lmthang – 19 February 2026 22h34

Gemini 3.1 Pro be like pic.twitter.com/ZwCauGxLar
Google (@Google), 19 February 2026

→ View original post on X — @lmthang, 2026-02-19 21:34 UTC
