Attention Is All You Need (2017): most cited ML paper of the decade.
— Sumanth (@Sumanth_077) May 5, 2026
For 8 years, every frontier model has been built on quadratic attention. Process every possible word-to-word relationship. Compute explodes with context length. Accuracy degrades past 200k tokens. https://t.co/tSJOH4A2Cu
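The quadratic cost comes from the attention score matrix: every token attends to every other token, so an n-token sequence produces an n × n matrix. A minimal NumPy sketch of scaled dot-product attention (an illustration, not the original Transformer code):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention. The score matrix is n x n,
    so compute and memory grow quadratically with sequence length n."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # shape (n, n): every token vs. every token
    # numerically stable softmax over each row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 8, 4  # toy sizes; a 200k-token context would need a 200k x 200k score matrix
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # output is (n, d), but the intermediate scores were (n, n)
```

Doubling n doubles the output size but quadruples the score matrix, which is why compute "explodes" with context length.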