A model trained for next-token prediction is forced to build compressed representations of latent structure in text. Ilya Sutskever correctly refers to this phenomenon as understanding. Here, a model trained for next-step sensor prediction, with a robot that has proprioception… pic.twitter.com/rHh1nFjJxd
— Nando de Freitas (@NandoDF) 27 juin 2026
A model trained for next-token prediction is forced to build compressed representations of latent structure in text. Ilya Sutskever correctly refers to this phenomenon as understanding. Here, a model trained for next-step sensor prediction, with a robot that has proprioception