"When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards" (Alzahrani et al.): https://arxiv.org/abs/2402.01781
#Artificialintelligence #DeepLearning #MachineLearning
LLM Leaderboards: Benchmarks as Targets Reveal Sensitivity
