AI Dynamics

Global AI News Aggregator

New Scaling Laws for 350M Model Training Tokens

FACT: If you don't train your 350M model on 28T tokens, you're not optimal.

Nicholas Roberts (@nick11roberts): "That new LFM2.5-350M is super overtrained, right? And everyone was shocked about how far they pushed it? As it turns out, we have a brand new scaling law for that! 🧵 [1/n]"

Source: https://nitter.net/nick11roberts/status/2041141606305124486#m

→ View original post on X (@maximelabonne, 2026-04-06 15:05 UTC)
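For context, the tokens-per-parameter ratio implied by the quoted figures can be computed directly. This is a minimal sketch using only the two numbers from the post; the roughly 20-tokens-per-parameter "Chinchilla" compute-optimal heuristic used for comparison is general background knowledge, not something the post states.

```python
# Tokens-per-parameter ratio implied by the quoted figures.
params = 350e6   # 350M-parameter model (from the post)
tokens = 28e12   # 28T training tokens (from the post)

ratio = tokens / params
print(f"{ratio:.0f} tokens per parameter")  # 80000

# Chinchilla-style compute-optimal training uses roughly ~20 tokens per
# parameter, so by that rule of thumb this run is heavily "overtrained":
overtraining_factor = ratio / 20
print(f"~{overtraining_factor:.0f}x the compute-optimal token budget")  # ~4000x
```

At 80,000 tokens per parameter, this is several orders of magnitude beyond the compute-optimal regime, which is why the post frames it as needing a new scaling law.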
