Predicting the next word "only" is sufficient for language models to learn a large body of knowledge that enables them to code, answer questions, understand many topics, chat, and so on.
— Nando de Freitas (@NandoDF) March 29, 2024
This is clear to many researchers now, and there are nice tutorials on why this works by… pic.twitter.com/L95G6hWRNV