Just a joke: don’t take this meme too seriously, and pls do rigorous evals 🙂 Explanation:
– Left: The simplest way to evaluate a language model is to play with it for 15 minutes. This is not scientific at all.
– Middle: The more systematic way is to create a diverse set of
Language Model Evaluation Methods: From Casual Testing to Rigorous Benchmarks