What production performance are you referring to, then? Or are you saying that academic benchmarks are bad in general? Either way, you would always need some eval benchmark, ideally one correlated with the prod use case.
Production Performance vs Academic Benchmarks Eval Correlation