ROSCOE is a first-of-its-kind suite of metrics for scoring step-by-step reasoning. We hope this work provides a foundation that enables scalable, systematic evaluation and benchmarking of new language models. Read the paper: https://bit.ly/3XxYAgU
ROSCOE: New Metrics Suite for Evaluating Step-by-Step Reasoning