AI Dynamics

Global AI News Aggregator

ARC-AGI-3 benchmark released, frontier models underperform humans

1. Incredible.
2. I give it four months before this is ~saturated.

François Chollet (@fchollet): "ARC-AGI-3 is out now! We've designed the benchmark to evaluate agentic intelligence via interactive reasoning environments. Beating ARC-AGI-3 will be achieved when an AI system matches or exceeds human-level action efficiency on all environments, upon seeing them for the first time. We've done extensive human testing that shows 100% of these environments are solvable by humans, upon first contact, with no prior training and no instructions. Meanwhile, all frontier AI reasoning models do under 1% at this time."

Source: https://nitter.net/fchollet/status/2036861192619384989#m

→ View original post on X — @mattshumer_, 2026-03-25 22:30 UTC
