AI Dynamics

Global AI News Aggregator

Loss spikes during training were rare and brief

We indeed had a handful of loss spikes, but these were very rare (maybe less than 5 or 10 over the entire training) and never lasted more than a couple of iterations, so we didn't have to do anything like skipping batches or lowering learning rate.

→ View original post on X (@guillaumelample)
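For readers unfamiliar with the mitigations the post says were not needed, here is a minimal sketch of what "skipping batches" on a loss spike could look like. It is not the authors' code. The names `SpikeDetector` and `find_spikes`, the rolling-window size, and the 3-sigma threshold are all hypothetical choices made for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class SpikeDetector:
    """Flags a loss value that sits far above the recent rolling average.

    Hypothetical helper: window, threshold and min_history are
    illustrative defaults, not values taken from the post.
    """

    def __init__(self, window=50, threshold=3.0, min_history=10):
        self.history = deque(maxlen=window)  # recent non-spike losses
        self.threshold = threshold           # number of std-devs above the mean
        self.min_history = min_history       # warm-up before flagging anything

    def is_spike(self, loss):
        spike = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            # Guard against a zero std-dev on a perfectly flat history.
            sigma = max(pstdev(self.history), 1e-8)
            spike = loss > mu + self.threshold * sigma
        if not spike:
            # Spikes are kept out of the history so they do not inflate
            # the baseline used to judge later steps.
            self.history.append(loss)
        return spike


def find_spikes(losses, **kwargs):
    """Return the indices of the steps that would be flagged as spikes."""
    detector = SpikeDetector(**kwargs)
    return [i for i, loss in enumerate(losses) if detector.is_spike(loss)]


if __name__ == "__main__":
    # Synthetic, slowly decreasing loss curve with one injected spike.
    losses = [2.0 - 0.01 * i for i in range(30)]
    losses[20] = 5.0
    print(find_spikes(losses))
```

In a real training loop, a flagged step would typically mean discarding that batch's update (or temporarily lowering the learning rate) instead of applying it. Because flagged values never enter the history, a genuine lasting jump in the loss level would keep being flagged, so a production version would need a way to reset the baseline.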
