We say “Transformers predict the next word,” but what is really happening is that they reshape a whole field of relationships until the pattern feels stable, like a marble rolling downhill until it comes to rest in the lowest valley. That “valley” is the lowest-entropy shape of language –
Transformers Shape Language Patterns Like Marbles