AI Dynamics

Global AI News Aggregator

About

Quadratic Attention Limits Frontier Models After 8 Years

"Attention Is All You Need" (2017) is the most cited ML paper of the decade. For 8 years, every frontier model has been built on quadratic attention: the model scores every possible token-to-token relationship, so compute explodes with context length, and accuracy degrades past 200k tokens.
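A minimal sketch of why the cost is quadratic, using standard scaled dot-product attention (the mechanism from the cited paper; the shapes and toy sizes here are illustrative, not from any particular model):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention over n tokens of dimension d.

    The intermediate score matrix is (n, n): every token attends to
    every other token, so memory and compute grow as O(n^2) with
    sequence length n -- the quadratic bottleneck described above.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n, n) quadratic term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # back to (n, d)

n, d = 1024, 64                                      # toy sizes
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)
print(out.shape)  # (1024, 64); the score matrix alone was 1024 x 1024
```

Doubling the context length quadruples the size of the score matrix, which is why long contexts (e.g. hundreds of thousands of tokens) become so expensive under this scheme.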

→ View original post on X — @sumanth_077