GeLU, LayerNorm and Mathematical Operations in Neural Networks
There is a lot of fun in the Appendix, e.g. how GeLU can be used to implement multiplication or be bypassed as an identity, and how LayerNorm can be used for division or likewise bypassed as an identity.
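The paper's exact constructions are not reproduced here, but a standard way to get multiplication out of GeLU is a sketch like the following: since GeLU is smooth with a nonzero quadratic term around zero, gelu(u) + gelu(-u) ≈ sqrt(2/π)·u² for small u, and the polarization identity (x+y)² − (x−y)² = 4xy then recovers a product from four GeLU evaluations. The identity bypass uses the fact that GeLU(z) ≈ z for large z, so a large bias makes the unit pass its input through. All function names and the scale parameters below are illustrative choices, not the paper's.

```python
import math

def gelu(x):
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF.
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_multiply(x, y, eps=1e-3):
    # gelu(u) + gelu(-u) = u * erf(u / sqrt(2)) ~ sqrt(2/pi) * u**2 for small u.
    # Shrinking inputs by eps keeps them in the near-quadratic regime;
    # the polarization identity (x+y)^2 - (x-y)^2 = 4xy isolates the product.
    s = gelu(eps * (x + y)) + gelu(-eps * (x + y))
    d = gelu(eps * (x - y)) + gelu(-eps * (x - y))
    return (s - d) * math.sqrt(math.pi / 2.0) / (4.0 * eps ** 2)

def gelu_identity(x, c=10.0):
    # For large positive inputs GeLU(z) ~ z, so a big bias c (subtracted
    # again afterward) makes the nonlinearity act as an identity on x.
    return gelu(x + c) - c
```

The same two tricks, used at different operating points of one activation, are what let a fixed network block act either as a multiplier or as a pass-through wire.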