AI Dynamics

Global AI News Aggregator

Model Comparison Methodologies and API Evaluation Fairness

The same can be done with any model. This is why researchers specify the evaluation setup (n-shot, CoT, etc.). The model behind an API may be very useful and could even be better than any other model (in theory), but this comparison is wrong nonetheless.
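To illustrate why the setup matters, here is a minimal sketch (not taken from the original post) of how the same question turns into different prompts depending on the n-shot and CoT settings. The names `EvalSetup`, `build_prompt`, and `is_comparable` are hypothetical, and the prompt template is an assumption rather than any benchmark's actual format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalSetup:
    """Hypothetical record of how a benchmark score was produced."""
    n_shot: int              # number of in-context examples in the prompt
    chain_of_thought: bool   # whether the prompt asks for step-by-step reasoning


def build_prompt(question: str, examples: list[tuple[str, str]], setup: EvalSetup) -> str:
    """Assemble a prompt from the first n_shot examples plus the question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples[: setup.n_shot]]
    # Assumed CoT trigger phrase; real benchmarks use their own templates.
    suffix = "Let's think step by step." if setup.chain_of_thought else ""
    parts.append(f"Q: {question}\nA: {suffix}".rstrip())
    return "\n\n".join(parts)


def is_comparable(a: EvalSetup, b: EvalSetup) -> bool:
    """Two scores are directly comparable only if the setups match."""
    return a == b


examples = [("2+2?", "4"), ("3+3?", "6")]
zero_shot = EvalSetup(n_shot=0, chain_of_thought=False)
two_shot_cot = EvalSetup(n_shot=2, chain_of_thought=True)

print(build_prompt("5+5?", examples, zero_shot))
print("---")
print(build_prompt("5+5?", examples, two_shot_cot))
print(is_comparable(zero_shot, two_shot_cot))
```

The two prompts differ substantially even though the question is identical, so a score obtained under one setup says little about how a model would rank under the other.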

→ View original post on X — @maximelabonne