AI Dynamics

Global AI News Aggregator

Training and Validation Loss Discrepancy in Single Epoch LLMs

If it's just one epoch, why would you expect an important difference between training loss and validation loss? (Unless there's some implicit practice I haven't heard of for LLMs, to use a validation set that isn't just a random holdout from training.)

→ View original post on X — @esyudkowsky
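The intuition behind the question can be illustrated with a toy sketch (a synthetic linear-regression stream, not an LLM; all names and numbers here are my own, not from the post): in a single-epoch pass, every batch consists of data the model has never seen, so the loss measured on a batch *before* updating on it is already an unbiased estimate of generalization loss. A random holdout should therefore track the training curve closely, which is the redundancy the tweet is pointing at.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression task: y = X @ w_true + noise.
d, n_train, n_val = 10, 2000, 500
w_true = rng.normal(size=d)
X = rng.normal(size=(n_train + n_val, d))
y = X @ w_true + 0.1 * rng.normal(size=n_train + n_val)

# Random holdout, as in the standard validation-set practice.
X_tr, y_tr = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

w = np.zeros(d)
lr, batch = 0.05, 50
train_curve, val_curve = [], []

# One epoch: each batch is visited exactly once, so the loss recorded on a
# batch before the update is computed on never-before-seen data.
for i in range(0, n_train, batch):
    Xb, yb = X_tr[i:i + batch], y_tr[i:i + batch]
    train_curve.append(mse(w, Xb, yb))      # "training" loss on fresh data
    val_curve.append(mse(w, X_val, y_val))  # loss on the random holdout
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch
    w -= lr * grad
```

Both curves fall together and end close to the noise floor; the per-batch training loss and the holdout loss stay within batch-sampling noise of each other, because neither is ever evaluated on data the model has already fit. A persistent train/val gap would require revisiting data (multiple epochs) or a validation set drawn from a different distribution than training.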
