This was the Massive Multitask Language Understanding (MMLU) test (humanities, STEM, social sciences, etc.). In the chart of models finetuned for the test, the LLaMA authors omitted GPT's superior results because GPT is not of "moderate size".
LLaMA omits GPT’s results from MMLU test chart due to size.
