Tokenizer and chat template issues are the main reason merged models underperformed on the Open LLM Leaderboard v2. The new evals themselves do not change the picture: a model merged to maximize MMLU will also perform well on them.
Tokenizer and Chat Template Issues Affect Merged Model Performance
