AI Dynamics

Global AI News Aggregator

Hyperparameter tuning: logarithmic scaling for better sweep design

Sometimes I see papers with hyperparameter sweeps over 0.001, 0.003, 0.006, 0.01, etc. Many hyperparameters are better expressed as negative integral log2, i.e. negative integer powers of two: small values like learning rates directly as 2**val, and values close to 1, like EMA factors and TD lambda / gamma, as 1 - 2**val. It…

→ View original post on X — @id_aa_carmack
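The convention is easy to put into code. Below is a minimal sketch of the idea (function names and sweep ranges are illustrative, not from the original post): learning rates are generated directly as 2**val, and near-1 quantities such as EMA factors or discount factors as 1 - 2**val, sweeping val over small negative integers.

```python
# A minimal sketch of log2-based sweep design, assuming hypothetical
# helper names; ranges below are illustrative examples.

def lr_from_log2(val: int) -> float:
    """Small values (e.g. learning rates): expressed directly as 2**val."""
    return 2.0 ** val  # e.g. val = -10 -> 0.0009765625

def near_one_from_log2(val: int) -> float:
    """Values close to 1 (EMA factors, TD lambda/gamma): 1 - 2**val."""
    return 1.0 - 2.0 ** val  # e.g. val = -7 -> 0.9921875

if __name__ == "__main__":
    # Unit steps in log2 space give an evenly spaced geometric grid.
    for val in range(-12, -7):        # learning rates 2**-12 .. 2**-8
        print(f"lr(2**{val}) = {lr_from_log2(val):.6g}")
    for val in range(-4, -9, -1):     # gammas 1 - 2**-4 .. 1 - 2**-8
        print(f"gamma(1 - 2**{val}) = {near_one_from_log2(val):.6g}")
```

Stepping val by one multiplies the small value by 2 (or halves the distance from 1), so a short integer range covers several orders of magnitude at uniform multiplicative spacing, unlike hand-picked decimal grids such as 0.001, 0.003, 0.006.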
