AI Dynamics

Global AI News Aggregator

Qwen 3 ARC-AGI score cannot be reproduced independently

Please note, we're not able to reproduce the 41.8% ARC-AGI-1 score claimed by the latest Qwen 3 release — neither on the public eval set nor on the semi-private set. The numbers we're seeing are in line with other recent base models. In general, only rely on scores verified by

→ View original post on X (@fchollet)
