With OpenAI o1, it’s very clear we need more advanced and unsaturated evals to realistically measure progress going forward. Scale AI will have a big announcement later this week… stay tuned. In the meantime, I’m back home in New Mexico, looking for eval inspiration.
OpenAI o1 Requires Advanced Evals for Progress Measurement