Transformers’ Interpolative Architecture and Limitations for Symbolic Tasks

Ironically, Transformers are even worse in that regard, largely because of their strongly interpolative architectural prior: multi-head attention literally hardcodes interpolation between samples in latent space. There is also the fact that recurrence is a really helpful prior for symbolic programs, and Transformers have none.
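The claim that attention "hardcodes sample interpolation" can be made concrete: in scaled dot-product attention the softmax weights are non-negative and sum to one, so each output vector is a convex combination, i.e. an interpolation, of the value vectors. Below is a minimal NumPy sketch of standard single-head attention (an illustration added here, not code from the original post) that checks this property directly:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention (no masking)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax: rows are >= 0 and sum to 1
    return weights @ V, weights                        # each output row is a weighted average of V rows

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 queries, dim 8
K = rng.normal(size=(6, 8))   # 6 keys
V = rng.normal(size=(6, 8))   # 6 values

out, w = scaled_dot_product_attention(Q, K, V)

# The attention weights form a probability distribution over the value vectors,
# so every output lies in the convex hull of the values: pure interpolation,
# never extrapolation beyond what the values span.
assert np.all(w >= 0)
assert np.allclose(w.sum(axis=-1), 1.0)
print(out.shape)  # (4, 8)
```

This is the sense in which the architecture favors interpolation: whatever the learned queries and keys do, the output of an attention layer is always a weighted average of its inputs' value projections.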

→ View original post on X: @fchollet
