AI Dynamics

Global AI News Aggregator

DSA: Learning-Based Token Selection Beyond Fixed Window Attention

Also, DSA is basically a smarter version of sliding window attention: instead of forcing each query to attend only to a fixed window of recent tokens, you "learn" which past tokens to select.
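To make the contrast concrete, here is a toy sketch in plain Python. It is not DSA's actual implementation: the post does not describe how the selection is scored, so the `scores` table below is a random placeholder for whatever a trained scoring module would output, and the names `sliding_window_indices`, `learned_topk_indices`, and `budget` are hypothetical. Both functions return the positions a query at position `t` is allowed to attend to, with the same budget of tokens.

```python
import random


def sliding_window_indices(t, window):
    """Fixed rule: query t attends to the last `window` positions, itself included."""
    start = max(0, t - window + 1)
    return list(range(start, t + 1))


def learned_topk_indices(scores, t, k):
    """Score-driven rule: query t attends to the k highest-scoring causal positions.

    scores[t][s] is a placeholder for a learned relevance score of token s
    for query t (an assumption; the source does not specify the scorer).
    """
    causal = range(t + 1)  # current and past tokens only
    ranked = sorted(causal, key=lambda s: scores[t][s], reverse=True)
    return sorted(ranked[:k])


random.seed(0)
seq_len, budget = 8, 3

# Random stand-in scores; a real model would compute these from the tokens.
scores = [[random.random() for _ in range(seq_len)] for _ in range(seq_len)]

t = seq_len - 1
window_sel = sliding_window_indices(t, budget)
learned_sel = learned_topk_indices(scores, t, budget)

print("sliding window:", window_sel)
print("learned top-k: ", learned_sel)
```

Both rules keep the same number of tokens per query, so the cost is comparable. The difference is that the window always keeps the most recent positions, while the score-driven rule can reach back to any earlier position that the scorer ranks highly.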

→ View original post on X — @rasbt