Data Decontamination and Generalization in Advanced Math Benchmarks

That's a valid point. The team tried hard to decontaminate the data. Also note that the results generalize to more challenging benchmarks such as HiddenMath and IMO-Bench.

→ View original post on X: @lmthang
