Excited to share our new paper on sharp capacity scaling of the Muon optimizer! Joint work with @EshaanNichani Denny Wu @albertobietti @jasondeanlee: arxiv.org/abs/2603.26554 (1/7)
→ View original post on X — @berkeley_ai, 2026-03-30 04:41 UTC