AI Dynamics

Global AI News Aggregator

Scaling Learning Rate with Batch Size in Model Training

I think it really depends on the batch size you are using. If you multiply the batch size by x, you should multiply the learning rate by sqrt(x).
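
As a rough illustration of the rule above, here is a minimal sketch in Python. The function name `scale_learning_rate`, its parameters, and the example values (a base learning rate of 1e-3 at batch size 256) are assumptions made for this example, not something taken from the original post. It only encodes the square-root heuristic as stated; whether that heuristic suits a given optimizer or model is a separate question.

```python
import math


def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Apply the square-root rule: if the batch size grows by a factor x,
    multiply the learning rate by sqrt(x).

    All names here are hypothetical and chosen for illustration only.
    """
    if base_lr <= 0 or base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("learning rate and batch sizes must be positive")

    # x is the factor by which the batch size changes (may be below 1).
    x = new_batch_size / base_batch_size
    return base_lr * math.sqrt(x)


# Assumed example values: batch size goes from 256 to 1024, so x = 4
# and the learning rate is multiplied by sqrt(4) = 2.
new_lr = scale_learning_rate(base_lr=1e-3, base_batch_size=256, new_batch_size=1024)
print(new_lr)
```

The same function also covers shrinking the batch: with x below 1, the learning rate is reduced by the corresponding square-root factor.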

→ View original post on X: @skirano
