Evaluating Language Models for Mathematics through Interactions
paper page: https://huggingface.co/papers/2306.01694
… introduce CheckMate, an adaptable prototype platform for humans to interact with and evaluate LLMs. We conduct a study with CheckMate to evaluate three language
CheckMate: Interactive Platform for Evaluating Language Models in Mathematics
