My guess:
There are likely lots of SAT questions in the training corpus, including many purchased training exams and test guides; this particular question is not in it.
The contamination measure in the GPT-4 paper is too crude: it looks for verbatim matches without controlling for, e.g., close paraphrasing.
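To make the point about verbatim matching concrete, here is a rough sketch in Python. It is not the procedure from the GPT-4 paper. The function names, the 50-character window, the word-overlap score, and the sample question are all assumptions made up for illustration. It compares a verbatim-chunk check with a crude word-overlap score on a reordered version of the same question.

```python
import re
from typing import List


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits (drops spaces and punctuation)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def verbatim_contaminated(question: str, corpus: List[str], window: int = 50) -> bool:
    """Flag the question if any `window`-character chunk of it occurs verbatim in a document.

    Hypothetical check: the window size and the exhaustive sliding window are assumptions.
    Questions shorter than `window` are compared as a single chunk.
    """
    q = normalize(question)
    docs = [normalize(d) for d in corpus]
    chunks = [q[i:i + window] for i in range(max(1, len(q) - window + 1))]
    return any(chunk in doc for chunk in chunks for doc in docs)


def word_set(text: str) -> set:
    """Set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def max_word_overlap(question: str, corpus: List[str]) -> float:
    """Highest Jaccard overlap between the question's word set and any document's word set."""
    q = word_set(question)
    best = 0.0
    for doc in corpus:
        d = word_set(doc)
        union = q | d
        if union:
            best = max(best, len(q & d) / len(union))
    return best


# Made-up example data, not taken from any real exam or corpus.
corpus = ["If 3x + 5 = 20, what is the value of x?"]
paraphrase = "What is the value of x if 3x + 5 equals 20?"

print(verbatim_contaminated(corpus[0], corpus))   # exact copy
print(verbatim_contaminated(paraphrase, corpus))  # reordered paraphrase
print(max_word_overlap(paraphrase, corpus))       # crude paraphrase signal
```

In this toy case the exact copy is flagged, the paraphrase passes the verbatim check, and the word-overlap score is still high. A real paraphrase check would need something stronger than word overlap, since rewording with synonyms defeats it as well.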
Training Data Contamination and GPT-4 Evaluation Limitations