AI Dynamics

Global AI News Aggregator

Toxic Data Improves LLM Post-Training Control and Separability

"Bad Data, Good Models?" A surprising take on LLM pretraining. This paper flips the script: pretraining with more toxic data can actually improve post-training control. Using Olmo-1B variants, the authors show that toxicity becomes more linearly separable—making

→ View original post on X — @jiqizhixin