Global AI News Aggregator
About
By
–
"A self-attention model is a linear MLP with dynamic weights" sounds about right
→ View original post on X — @rasbt,