AI Dynamics

Global AI News Aggregator

Need for Closed LLM Evaluations and Unknown Benchmarks

Totally agree. I was just talking recently about the need for closed evaluations and unknown benchmarks: some kind of LLM comparison test that can't be gamed so easily.

→ View original post on X — @thatroblennon