LLaMA-65B outperforms Minerva-62B on GSM8k, even though it has not been fine-tuned on any mathematical dataset. On the MATH benchmark, it outperforms PaLM-62B but falls well below Minerva-62B.
LLaMA-65B Outperforms Minerva-62B on GSM8k Without Mathematical Fine-tuning