One exciting observation about transformers (and most of modern deep learning) is that you can understand them using high school math: really just multiplication, division, sums, and exponentiation, applied many times, in a strange and initially hard-to-grok order.
Understanding Transformers with High School Mathematics
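To make the opening claim concrete, here is a minimal sketch of one attention step for a single query, written in plain Python with nothing beyond multiplication, division, sums, and exponentiation. The function names (`dot`, `softmax`, `attend`) are illustrative choices, not taken from the article, and the sketch assumes the standard scaled dot-product form; it leaves out learned projections, multiple heads, and the usual numerical-stability tricks.

```python
import math


def dot(u, v):
    # Multiply matching entries, then sum them up.
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    # Exponentiate each score, then divide by the total so the weights sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    # Scale by the square root of the vector length (an exponent of 0.5).
    scale = len(query) ** 0.5
    scores = [dot(query, k) / scale for k in keys]
    weights = softmax(scores)
    # The output is a weighted sum of the value vectors, one coordinate at a time.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


if __name__ == "__main__":
    # Two identical keys get equal weight, so the output averages the two values.
    print(attend([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [[2.0, 0.0], [4.0, 0.0]]))
```

Every line is arithmetic a high school student has seen. What takes getting used to is the order: multiply and sum to get scores, exponentiate and divide to get weights, then multiply and sum again.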