Alternative explanation: the asymptote is due to the benchmark saturating. We need better benchmarks! But of course, there is no doubt that both open weights and closed models are pushing the envelope quite drastically. Fun times!
Benchmark Saturation Drives Need for Better AI Evaluation Metrics
By
–
Leave a Reply