Evaluating LLMs is hard and nuanced, especially with academic evals, which are massively leaked. Evals that rely on human judgement are far superior, so it feels good that Bard with Gemini Pro (free tier) climbed pretty high on LMSYS. Looking forward to the Gemini Ultra release!
LLM Evaluation Methods and Gemini Pro Performance on LMSYS