While human expert evaluation remains the gold standard for mathematical proofs, its cost and time demands make it hard to scale for research. To address this, we built ProofAutoGrader, an automatic grader for IMO-ProofBench. The autograder leverages Gemini 2.5 Pro, providing it with a
ProofAutoGrader: Automatic IMO Proof Evaluation Using Gemini