AI Dynamics

Global AI News Aggregator

Training stability control with loss and gradient norm thresholding

Cool! For the spike, I'd try e.g. `-sl 7 -sg 7` to keep the instability in check earlier in training. (This skips the update if the loss or the gradient norm is detected as an outlier more than 7 sigma above its recent values.)
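To make the idea concrete, here is a minimal sketch of sigma-based update skipping. It is not the implementation behind the `-sl` and `-sg` flags. It assumes the outlier test is a z-score against a rolling window of recent values, and the names `OutlierDetector` and `should_skip_update`, the window size of 128, and the warm-up behaviour are all hypothetical choices made for illustration.

```python
import math
from collections import deque


class OutlierDetector:
    """Rolling z-score over the most recent `window` values (hypothetical helper)."""

    def __init__(self, window=128):
        # deque with maxlen drops the oldest value automatically once full
        self.values = deque(maxlen=window)

    def zscore(self, x):
        """Return the z-score of x against the window, or None while the window fills."""
        if len(self.values) < self.values.maxlen:
            self.values.append(x)
            return None
        mean = sum(self.values) / len(self.values)
        var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
        std = math.sqrt(var)
        if std == 0.0:
            # Degenerate window: any deviation from the mean counts as infinite
            z = 0.0 if x == mean else math.copysign(math.inf, x - mean)
        else:
            z = (x - mean) / std
        # Assumption: the new value joins the window even when it is an outlier
        self.values.append(x)
        return z


def should_skip_update(loss, grad_norm, loss_detector, grad_detector, sl=7.0, sg=7.0):
    """Return True if the loss or the gradient norm is above its sigma threshold.

    `sl` and `sg` mirror the thresholds in the post: 7.0 means "skip on a > 7 sigma spike".
    """
    # Evaluate both detectors first so each window is updated on every step
    z_loss = loss_detector.zscore(loss)
    z_grad = grad_detector.zscore(grad_norm)
    loss_spike = z_loss is not None and z_loss > sl
    grad_spike = z_grad is not None and z_grad > sg
    return loss_spike or grad_spike
```

In a training loop, the check would sit between the backward pass and the optimizer step: when `should_skip_update(...)` returns `True`, the step is skipped and the gradients are discarded. Only upward spikes trigger a skip here, matching the "loss/gradnorm > 7 sigma" wording. Lower thresholds react to milder spikes but also risk skipping updates that were fine.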

Source: original post on X by @karpathy
