AI Dynamics

Global AI News Aggregator

Trilinear Attention Rewrites Transformer Scaling Laws

What if attention operated in 3D? This paper introduces trilinear (2-simplicial) attention, which may rewrite current transformer scaling laws by reaching the same accuracy with far fewer tokens.
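To make "trilinear" concrete, here is a minimal sketch of one common formulation of 2-simplicial attention, in which each query scores pairs of positions rather than single positions. It is an illustration under stated assumptions, not the paper's implementation: the function name `trilinear_attention`, the `1/sqrt(d)` scaling, and the elementwise-product value for each pair are assumptions chosen for clarity.

```python
import math

def trilinear_attention(q, k1, k2, v1, v2):
    """Toy 2-simplicial attention: each query attends over pairs (j, k).

    q, k1, k2 are lists of d-dimensional vectors; v1, v2 are lists of
    value vectors. All names and the scaling are illustrative assumptions.
    """
    d = len(q[0])
    n1, n2 = len(k1), len(k2)
    dv = len(v1[0])
    scale = 1.0 / math.sqrt(d)  # assumed, borrowed from standard attention
    out = []
    for qi in q:
        # Trilinear logit for each pair: sum_t q[t] * k1[j][t] * k2[k][t].
        logits = [[scale * sum(qi[t] * k1[j][t] * k2[k][t] for t in range(d))
                   for k in range(n2)] for j in range(n1)]
        # Softmax over all (j, k) pairs jointly, shifted by the max for stability.
        m = max(max(row) for row in logits)
        weights = [[math.exp(x - m) for x in row] for row in logits]
        z = sum(sum(row) for row in weights)
        # The value of a pair is taken as the elementwise product v1[j] * v2[k].
        out.append([sum(weights[j][k] * v1[j][t] * v2[k][t]
                        for j in range(n1) for k in range(n2)) / z
                    for t in range(dv)])
    return out
```

Because every query scores all pairs of positions, this naive version does work cubic in sequence length, compared with quadratic for standard dot-product attention.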

→ View original post on X — @askalphaxiv