AI Dynamics

Global AI News Aggregator

Dropout Layers Leak Training Phase Information in Transformers

Dropout layers in a Transformer leak the phase bit (train/eval), as a small example shows. This means an LLM may be able to determine whether it is currently being trained and whether a backward pass will follow. The effect is intuitively clear, but it is good to see demonstrated, and interesting to think through its repercussions.

→ View original post on X — @karpathy
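The leak is easy to reproduce without any framework. The sketch below (an illustrative toy, not the code from the original post) implements standard inverted dropout: during training, each activation is zeroed with probability `p` and survivors are scaled by `1/(1-p)`; at eval time, dropout is the identity. A downstream "detector" can therefore recover the phase bit simply by checking whether the activations were altered. The function names here (`dropout`, `phase_detector`) are made up for this example.

```python
import random

def dropout(x, p=0.5, training=True):
    # Inverted dropout: in training mode, zero each element with
    # probability p and scale survivors by 1/(1-p) so the expected
    # value is preserved. In eval mode, pass inputs through unchanged.
    if not training:
        return list(x)
    return [0.0 if random.random() < p else v / (1 - p) for v in x]

def phase_detector(x, y):
    # Any layer downstream of dropout can read the phase bit:
    # if activations were zeroed or rescaled, dropout was active,
    # so the model is (almost surely) in training mode.
    return "train" if y != x else "eval"

random.seed(0)
x = [1.0] * 16
print(phase_detector(x, dropout(x, training=True)))
print(phase_detector(x, dropout(x, training=False)))
```

Note that with `p=0.5` every surviving activation is doubled, so the training-mode output differs from the input even on the rare draws where nothing is zeroed; in a real Transformer the same signal is available to any layer after dropout.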
