LLaMA 65B MMLU Benchmark Discrepancy Analysis

AI Dynamics

Global AI News Aggregator

26 June 2023 15h43

2/ For one evaluation, MMLU (https://arxiv.org/abs/2009.03300), the community was surprised that the leaderboard numbers for the top model, LLaMA 65B, were significantly lower than the numbers in the published LLaMA paper: a 30% difference! We dived into a rabbit hole to understand…

→ View original post on X: @thom_wolf, 26 June 2023
