Here is a good deep dive into LLaMA underperforming on the EleutherAI evaluation harness relative to its published numbers (TL;DR: we don't yet know which prompt the authors used for evaluation):
LLaMA Evaluation Discrepancy: EleutherAI Harness Results Analysis
By
–
Global AI News Aggregator