Why Transformers and Self-Attention Over Convolutional or RNN Layers
Sure. But why specifically transformer layers and self-attention, and not, say, convolutional or RNN layers?
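The contrast the question is driving at can be made concrete. Below is a minimal numpy sketch (illustrative only; none of this code, nor its shapes or parameter names, comes from the original post): single-head self-attention lets every position mix with every other position in one layer via a T×T weight matrix, while a 1-D convolution mixes only a fixed k-sized local window, so relating two distant positions requires stacking layers. An RNN, by comparison, would have to propagate a hidden state step by step across the gap.

```python
# A minimal sketch (not from the original post) contrasting the dependency
# structure of self-attention with that of a 1-D convolution. All shapes
# and parameter names here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                      # sequence length, model dimension
X = rng.standard_normal((T, d))  # one toy input sequence

def self_attention(X):
    """Single-head scaled dot-product attention: each output row is a
    weighted sum over *all* T input rows, so position i can interact
    with position j in one layer, regardless of the distance |i - j|."""
    Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)                   # (T, T) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V

def conv1d(X, k=3):
    """A 1-D convolution over the time axis: each output position sees
    only a window of k neighbouring inputs, so connecting positions that
    are n steps apart needs on the order of n / k stacked layers."""
    W = rng.standard_normal((k, d, d))
    pad = np.pad(X, ((k // 2, k // 2), (0, 0)))
    return np.stack([sum(pad[t + j] @ W[j] for j in range(k))
                     for t in range(T)])

print(self_attention(X).shape)  # (6, 4): each row mixes all 6 positions
print(conv1d(X).shape)          # (6, 4): each row mixes only 3 neighbours
```

Note also that the attention computation is a pair of matrix products over the whole sequence at once, with no recurrence, which is one commonly cited reason it parallelizes better than an RNN during training.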