Training Loss Noise and Validation Loss Smoothing Explained

AI Dynamics

Global AI News Aggregator

28 May 2024 18h25

Training loss is evaluated over a single batch, i.e. 0.5M tokens. It's noisy, but this is expected: at any given step you could be iterating through easy or hard documents in the training data. The validation loss is averaged over 20 batches of 0.5M tokens each (the number of batches is a hyperparameter), so it is smoother.
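The smoothing effect can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code from the original post: it assumes each batch loss is a fixed base value plus independent Gaussian noise, and the names `simulate_losses`, `noise_std`, and `base_loss` are hypothetical. Under that assumption, averaging 20 batches should cut the standard deviation by roughly a factor of sqrt(20), about 4.5.

```python
import random
import statistics


def simulate_losses(num_steps=200, val_batches=20, noise_std=0.1,
                    base_loss=3.0, seed=0):
    """Simulate per-step training and validation loss readings.

    Assumption (not from the source): every batch loss is base_loss plus
    independent Gaussian noise with standard deviation noise_std.
    """
    rng = random.Random(seed)
    train, val = [], []
    for _ in range(num_steps):
        # Training loss: measured on a single batch, so it carries the
        # full per-batch noise.
        train.append(base_loss + rng.gauss(0.0, noise_std))
        # Validation loss: the mean over val_batches batches, so the
        # independent noise terms partially cancel.
        val.append(statistics.fmean(
            base_loss + rng.gauss(0.0, noise_std)
            for _ in range(val_batches)
        ))
    return train, val


train, val = simulate_losses()
train_std = statistics.stdev(train)
val_std = statistics.stdev(val)
print(f"train loss std: {train_std:.4f}")
print(f"val loss std:   {val_std:.4f}")
print(f"ratio:          {train_std / val_std:.2f}")
```

Real batches are not perfectly independent or identically distributed, so the actual reduction in noise may differ from the idealized sqrt(20) factor. The qualitative picture is the same: more batches per evaluation gives a smoother curve at the cost of more compute per evaluation.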

→ View original post on X: @karpathy, 28 May 2024
