FACT: If you don't train your 350M model on 28T tokens, you're not optimal. Nicholas Roberts (@nick11roberts): That new LFM2.5-350M is super overtrained, right? And everyone was shocked about how far they pushed it? As it turns out, we have a brand new scaling law for that! 🧵 [1/n] — https://nitter.net/nick11roberts/status/2041141606305124486#m
— View original post on X — @maximelabonne, 2026-04-06 15:05 UTC
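The post doesn't state its new scaling law, but the "overtrained" framing can be made concrete with the well-known Chinchilla parametric loss from Hoffmann et al. (2022), L(N, D) = E + A/N^α + B/D^β, using the published fit constants. This is an illustrative sketch only, not the law the thread announces; the numbers below (350M parameters, 28T tokens) come from the post, everything else is an assumption:

```python
# Hedged illustration: Chinchilla parametric loss fit (Hoffmann et al., 2022).
# These constants are the published fit, NOT the new law from the thread.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(N, D):
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / N**alpha + B / D**beta

N = 350e6           # 350M parameters, as in the post
D_optimal = 20 * N  # classic rule of thumb: ~20 tokens per parameter
D_over = 28e12      # 28T tokens, the "overtrained" regime from the post

print(f"Chinchilla-optimal data for 350M: {D_optimal / 1e9:.0f}B tokens")
print(f"loss at 20N tokens: {loss(N, D_optimal):.3f}")
print(f"loss at 28T tokens: {loss(N, D_over):.3f}")
```

Under this fit, 28T tokens is roughly 4000x the classic compute-optimal budget for a 350M model, yet the predicted loss keeps dropping, which is why "overtraining" small models can still pay off when inference cost dominates.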
