Oh, and sorry, @DrJimFan, even the 2014 paper is not the original attention paper. It actually goes back to 1991 (via @SchmidhuberAI) in https://semanticscholar.org/paper/Learning-to-Control-Fast-Weight-Memories-An-to-Schmidhuber/bc22e87a26d020215afe91c751e5bdaddd8e4922… I've summarized it here: https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure…
Attention Mechanism Origins Traced Back to 1991 Schmidhuber