16/ HELM: Now let's take a look at the HELM implementation. The few-shot prompt is similar, but the way the model is evaluated is quite different: we use the model's next-token probabilities to select a text generation, and we compare that generated text to the text of the expected answer.
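To make the idea concrete, here is a minimal sketch of generation-based scoring. It is not HELM's actual code: `toy_model`, `greedy_generate`, and `is_correct` are hypothetical names, the toy prompt and probabilities are made up for illustration, and it assumes greedy decoding (always picking the most probable next token) followed by an exact-match comparison after trimming whitespace.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a language model: maps the tokens seen so far
# to a probability distribution over the next token.
NextTokenProbs = Callable[[List[str]], Dict[str, float]]


def greedy_generate(
    next_token_probs: NextTokenProbs,
    prompt_tokens: List[str],
    max_new_tokens: int = 5,
    stop_token: str = "\n",
) -> str:
    """Build a generation by repeatedly taking the most probable next token."""
    tokens = list(prompt_tokens)
    generated: List[str] = []
    for _ in range(max_new_tokens):
        probs = next_token_probs(tokens)
        token = max(probs, key=probs.get)  # assumption: greedy decoding
        if token == stop_token:
            break
        tokens.append(token)
        generated.append(token)
    return "".join(generated)


def is_correct(generated_text: str, expected_answer: str) -> bool:
    """Compare the generated text to the expected answer text (exact match)."""
    return generated_text.strip() == expected_answer.strip()


def toy_model(tokens: List[str]) -> Dict[str, float]:
    """Toy distribution: favors ' B' right after 'Answer:', then stops."""
    if tokens[-1] == "Answer:":
        return {" A": 0.1, " B": 0.6, " C": 0.2, " D": 0.1}
    return {"\n": 0.9, " A": 0.1}


# Made-up few-shot-style prompt, already split into tokens for simplicity.
prompt = ["Question: 2 + 2 = ? A. 3 B. 4 C. 5 D. 6\n", "Answer:"]

generation = greedy_generate(toy_model, prompt)
print(repr(generation), is_correct(generation, "B"))
```

Note that the score depends on the text the model actually produces: if the most probable continuation were something other than the expected answer text, the example would count as wrong, even if the correct letter had a relatively high probability.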
HELM Implementation: Model Evaluation via Token Probabilities