AI Dynamics

Global AI News Aggregator

GeLU, LayerNorm and Mathematical Operations in Neural Networks

There is a lot of fun in the Appendix, e.g. how GeLU can be used for multiplication or bypassed so that it acts as an identity, and how LayerNorm can be used for division or likewise bypassed as an identity.
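The appendix itself is not reproduced here, so the following is only a minimal sketch of how such tricks can work in principle, not necessarily the construction the paper uses. It assumes the exact (erf-based) GeLU and a LayerNorm without learned gain or bias. All function names, the scaling constants, and the input layouts are hypothetical choices made for illustration.

```python
import math


def gelu(x):
    # Exact GeLU: x * Phi(x), where Phi is the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(v, eps=1e-12):
    # Plain LayerNorm over a list, with no learned gain or bias (assumption).
    mean = sum(v) / len(v)
    var = sum((t - mean) ** 2 for t in v) / len(v)
    return [(t - mean) / math.sqrt(var + eps) for t in v]


def gelu_square(x, scale=1e-3):
    # For small s: gelu(s) + gelu(-s) = s * erf(s / sqrt(2)) ~ sqrt(2/pi) * s^2,
    # so shrinking the input and rescaling the output approximates x^2.
    s = scale * x
    return math.sqrt(math.pi / 2.0) * (gelu(s) + gelu(-s)) / scale**2


def gelu_multiply(x, y, scale=1e-3):
    # Polarization identity: x * y = ((x + y)^2 - (x - y)^2) / 4.
    return (gelu_square(x + y, scale) - gelu_square(x - y, scale)) / 4.0


def gelu_identity(x, shift=20.0):
    # For large positive inputs GeLU is (numerically) the identity, so
    # shifting in, applying GeLU, and shifting back passes x through.
    # Only valid while x + shift stays large and positive.
    return gelu(x + shift) - shift


def layernorm_divide(y, x, scale=1e-4):
    # Input [s*y, -s*y, x, -x] has zero mean and variance ~ x^2 / 2 when
    # s*y is tiny, so the first output is ~ s * y * sqrt(2) / |x|.
    # Note this divides by |x|, not by x.
    out = layer_norm([scale * y, -scale * y, x, -x])
    return out[0] / (scale * math.sqrt(2.0))


def layernorm_identity(x, big=1e4):
    # A large constant pair dominates the variance, so LayerNorm only
    # rescales x by a known factor sqrt(2) / big, which is undone afterwards.
    out = layer_norm([x, -x, big, -big])
    return out[0] * big / math.sqrt(2.0)


if __name__ == "__main__":
    print(gelu_multiply(3.0, 4.0))      # close to 12
    print(gelu_identity(-3.5))          # close to -3.5
    print(layernorm_divide(6.0, 3.0))   # close to 2
    print(layernorm_identity(2.5))      # close to 2.5
```

All four results are approximations. The multiplication and division errors shrink as `scale` gets smaller, and the identity errors shrink as `shift` and `big` get larger, at the cost of relying on more floating-point precision.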

→ View original post on X by @karpathy
