1/ We are launching SEAL Leaderboards—private, expert evaluations of leading frontier models. Our design principles:
Private + Unexploitable. No overfitting on evals!
Domain Expert Evals
Continuously Updated w/ new Data and Models
Read more at http://scale.com/leaderboard
Scale Launches SEAL Leaderboards for Frontier Model Evaluation
