Benchmarking Credibility: Training Data Transparency in AI Models

Why does everyone get so excited about performance on some benchmarks when we don't know what the training data is? How do you know the models aren't being tested on their training set? How can we reject papers that don't report their splits, yet get so excited about these PR announcements?
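The contamination worry above can be made concrete. One common (if crude) check is n-gram overlap: flag any test example that shares a long-enough word n-gram with the training corpus. This is a minimal sketch under that assumption; the corpora, function names, and the n-gram length are all illustrative, not any particular lab's methodology.

```python
def ngrams(text, n=8):
    """Return the set of word-level n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contamination_rate(train_docs, test_docs, n=8):
    """Fraction of test documents sharing at least one n-gram with the training data.

    A nonzero rate suggests the benchmark may leak into the training set;
    without access to the training data, no one outside the lab can run this.
    """
    train_grams = set()
    for doc in train_docs:
        train_grams |= ngrams(doc, n)
    flagged = sum(1 for doc in test_docs if ngrams(doc, n) & train_grams)
    return flagged / len(test_docs) if test_docs else 0.0
```

The point is less the specific heuristic than who can run it: this check requires the training corpus, which is exactly what the PR announcements don't disclose.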