Hah, yeah, little-known fun fact: Schmidhuber proposed an alternative to RNNs back in 1991, which more recent papers now call "linear Transformers" or "Transformers with linearized self-attention". Summarized it here: https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure…
Schmidhuber’s 1991 Alternative to RNNs: The Origins of Linear Transformers