Scaling Learning Rate with Batch Size in Model Training
Global AI News Aggregator
I think it really depends on the batch size you are using. Say you multiply the batch size by x: as a rule of thumb, you should then multiply the learning rate by sqrt(x).
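To make the arithmetic concrete, here is a minimal sketch of that square-root rule. The function name `scale_learning_rate` and its parameters are made up for illustration and do not come from any particular library; the numbers in the usage lines are arbitrary.

```python
import math


def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Return base_lr adjusted by the square-root rule from the post.

    Hypothetical helper: if the batch size is multiplied by x,
    the learning rate is multiplied by sqrt(x).
    """
    if base_lr <= 0 or base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("learning rate and batch sizes must be positive")
    # x is the factor by which the batch size changes (it may be below 1).
    x = new_batch_size / base_batch_size
    return base_lr * math.sqrt(x)


if __name__ == "__main__":
    # Going from batch size 32 to 128 gives x = 4, so the rate is multiplied by 2.
    print(scale_learning_rate(1e-3, 32, 128))
    # Shrinking the batch works the same way: x = 0.25, so the rate is halved.
    print(scale_learning_rate(1e-3, 128, 32))
```

The result is probably best treated as a starting point for tuning rather than a guaranteed optimum.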