They use "optimal" in a relative sense. I read it in a reply to my post, and then subsequently looked it up in LLM search engine—which happily told me why "Chinchilla is inference optimal because […]". But that's only relatively to undertrained models.
Optimal in Relative Terms: Chinchilla and Model Training
By
–
Leave a Reply