We have been shipping Community Evals & Benchmark Datasets: benchmark datasets host benchmark leaderboards, you can now contribute eval results by opening a PR to a model repository, and all PRs are fed to the benchmark datasets. Chat with datasets: agents live in Data
Community Evals and Benchmark Datasets for AI Models
