Oh yes it's the nice "Augmenting Self-attention with Persistent Memory" – section 4 – "Feedforward sublayer as an attention layer" 🙂