AI Dynamics

Global AI News Aggregator

GeLU, LayerNorm and Mathematical Operations in Neural Networks

There is a lot of fun in the Appendix, e.g. how GeLU can be used for multiplication or bypassed as an identity, and how LayerNorm can be used for division or likewise bypassed as an identity, etc.
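To make these tricks concrete, here is a minimal sketch of one way they could work. It is not necessarily the construction used in the appendix the post refers to. It relies on two standard facts: for small z, GeLU(z) + GeLU(-z) = z * erf(z / sqrt(2)) is approximately sqrt(2/pi) * z^2, and LayerNorm divides its input by the standard deviation of that input. All function names, the scaling constants `eps`, `offset` and `big`, and the four-component input layout are assumptions chosen for illustration.

```python
import math


def gelu(x: float) -> float:
    """Exact GeLU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_multiply(x: float, y: float, eps: float = 1e-3) -> float:
    """Approximate x * y using only GeLU calls and linear maps (hypothetical helper).

    For small z, gelu(z) + gelu(-z) ~= sqrt(2/pi) * z**2, which gives a square.
    The product then follows from x*y = ((x+y)**2 - (x-y)**2) / 4.
    """
    def square(z: float) -> float:
        return math.sqrt(math.pi / 2.0) * (gelu(z) + gelu(-z))

    s, d = eps * (x + y), eps * (x - y)  # shrink inputs into the quadratic regime
    return (square(s) - square(d)) / (4.0 * eps ** 2)


def gelu_identity(x: float, offset: float = 20.0) -> float:
    """Bypass GeLU: for large inputs gelu(z) ~= z, so shift in and shift back out."""
    return gelu(x + offset) - offset


def layer_norm(v: list[float], eps: float = 1e-12) -> list[float]:
    """Plain LayerNorm without learned scale or bias."""
    mean = sum(v) / len(v)
    var = sum((a - mean) ** 2 for a in v) / len(v)
    return [(a - mean) / math.sqrt(var + eps) for a in v]


def layernorm_divide(x: float, y: float, big: float = 1e4) -> float:
    """Approximate x / |y| (hypothetical helper; the sign of y is lost).

    The vector [x, -x, big*y, -big*y] has zero mean and a standard deviation
    of about big * |y| / sqrt(2), so LayerNorm divides x by a multiple of |y|.
    """
    out = layer_norm([x, -x, big * y, -big * y])
    return out[0] * big / math.sqrt(2.0)


def layernorm_identity(x: float, big: float = 1e4) -> float:
    """Bypass LayerNorm: a large constant pair pins the standard deviation."""
    out = layer_norm([x, -x, big, -big])
    return out[0] * big / math.sqrt(2.0)


if __name__ == "__main__":
    print(gelu_multiply(3.0, -2.0))      # close to -6
    print(gelu_identity(-3.7))           # close to -3.7
    print(layernorm_divide(3.0, 4.0))    # close to 0.75
    print(layernorm_identity(3.0))       # close to 3
```

In an actual network, the rescaling constants in this sketch would presumably be folded into the linear layers before and after the nonlinearity. The approximations hold only while the scaled inputs stay small relative to the chosen constants.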

→ View original post on X — @karpathy