Probably the first product Thinky will build is a full panel of dials that researchers can use to physically adjust all the hparams during training. We gonna do hardware one day and it is the time 😂 https://t.co/PXN7BDnPOF
— Lilian Weng (@lilianweng) May 25, 2025
Quoted post by Stephen Roller (@stephenroller): Some teams use sweeps, heuristics, or scaling laws to determine their training LR. At Character, we just have Noam Shazeer dial it to the right value. https://nitter.net/stephenroller/status/1801436697449648249#m
→ View original post on X — @lilianweng, 2025-05-25 04:08 UTC