A thread on "what was going on with the Open LLM Leaderboard?" Its numbers didn't match the ones reported in the LLaMA paper, so we dug into it and wrote a blog post about what we learned! Here's the thread version for those of you who don't want to read a blog post.
Open LLM Leaderboard evaluation accuracy analysis findings