AI Dynamics

Global AI News Aggregator

Joke Benchmark Proves Surprisingly Effective for Model Evaluation

For a benchmark that was originally intended to be a joke, I found it's surprisingly effective at quickly evaluating how good a model is!

→ View original post on X — @simonw