AI Dynamics

Global AI News Aggregator

Open Models Overfitting Benchmarks While Losing Reasoning Ability

@xeophon On the topic of swe-rebench and lower scores, another data point for you: my own analysis suggests open models are overfitting to popular patterns/benchmarks while failing to get better at logical reasoning / problem solving:

Original post on X by @alexjc