AI Dynamics

Global AI News Aggregator

LLaMA Evaluation Discrepancy: EleutherAI Harness Results Analysis

Here is a good deep dive into LLaMA underperforming on the EleutherAI harness relative to the published numbers (TL;DR: we don't yet know which prompt they used for evaluation):

→ View original post on X: @thom_wolf
