AI Dynamics

Global AI News Aggregator

LLM Evaluation Methods and Gemini Pro Performance on LMSYS

Evaluation of LLMs is very hard and nuanced (especially academic evals, which are leaked massively). Evals that rely on human judgement are far superior, so it feels good that Bard Gemini Pro (free tier) climbed pretty high on LMSYS. Looking forward to the Gemini Ultra release!

→ View original post on X: @oriolvinyalsml
