Fast progress in training AI agents to interact with the world. Training on just 2,541 hours of Minecraft video, Google built an AI that runs on a single GPU & was able to mine diamonds offline (which takes an average of 24,000 clicks). The same approach may work for AI robots.
@emollick
-
Constraints Drive Progress: AI’s Current Technical Bottlenecks
Part of why “wall/no wall” is not a useful distinction. Walls block progress, but they also concentrate effort in ways that can result in rapid improvement. See also progress on hallucinations. Current reverse salients: continual learning, pro-activity, effective memory…
-
AI’s Jagged Frontier: Math and Planning as Reverse Salients
It turns out that the AI jagged frontier worked as a reverse salient, a term from the history of science for a technology or process that holds back the whole system & thus becomes a focus of development. Math & planning were reverse salients, so they have seen the most improvement.
-
Excel Agent Mode: Microsoft’s New AI-Powered Frontier Feature
Here are the instructions for trying it: https://support.microsoft.com/en-us/office/agent-mode-in-excel-frontier-a2fd6fe4-97ac-416b-b89a-22f4d1357c7a
… -
Microsoft’s New Excel Agent Surpasses Copilot Capabilities
I've been playing with the new Excel agent and it seems like Microsoft is taking a leap past copilots. It still feels like it is in development, but it does autonomous Excel work much better than Microsoft's Copilot approach, which it effectively kills (with unclear implications for work).
-
Claude Sonnet 4.5 Shows Verbal Cleverness With Creative Prompt Mashup
Claude Sonnet 4.5 continues the tradition of Claude verbal cleverness. For fun, I gave it this very random prompt: “Mash these up into a fine paste: [I quote the final line of 100 Years of Solitude] And: 10 PRINT “HELLO WORLD”
20 GOTO 10” Lots of smart bits in the answer.
-
AI Agents Now Capable of Real Valuable Work
AI agents are now capable of doing real, if bounded, work. But that work can be very valuable. For example, the new Claude Sonnet 4.5 was able to replicate published economics research from data files & the paper. We need to figure out what to do with it:
-
AI System Solves CAPTCHA Beyond Given Instructions
Admittedly they are also somewhat easy to convince. (funnily enough, it solved the CAPTCHA, not my instruction).
-
LLMs Refuse CAPTCHAs Despite Superior Solving Ability
It's kind of funny that AI can definitely do most common CAPTCHAs better than humans, and the reason that CAPTCHAs still work is that the big LLMs often refuse to do them.
-
ChatGPT Codex Recreates Lost Maxis SimRefinery Game
I gave ChatGPT Codex an article & screenshot from a famous, lost Maxis simulation, SimRefinery, and asked it to create it for me. It built a playable prototype for me while I did other stuff, never touching any code once, instead just occasionally poking Codex for small changes. pic.twitter.com/SZi5HMAArw
— Ethan Mollick (@emollick) September 27, 2025