AI Dynamics

Global AI News Aggregator

New LM Reasoning Benchmarks for Complex Mathematical Objects

🧮 New work from @AIatMeta & @LTIatCMU! LM reasoning benchmarks mostly use simple answers like numbers (AIME) or multiple-choice options (GPQA). But on complex mathematical objects, performance drops sharply. We propose a set of solutions to address this: arxiv.org/abs/2603.18886

→ View original post on X — @jeande_d, 2026-03-20 17:44 UTC
