AI Dynamics

Global AI News Aggregator

Evaluating AI Claims: The Challenge of Secret Benchmarks

That is well covered (see coverage of Hinton, Bengio, Musk, Altman, etc.). Reporting on benchmarks and evals is possible, but much of this work is secretive and peer-reviewed publications are rare, which makes it hard to do. It’s not enough to say “x scientist says cancer will be cured in 10 years.”

→ View original post on X — @madhumita29