AI Dynamics

Global AI News Aggregator

Synthetic Data Training Limitations and Mode Collapse in Self-Distillation

Training on purely synthetic data adds no new information, so there is little reason the model *should* improve. When evals go up after “self-distillation”, the gain often comes from a less visible tradeoff, e.g. mode collapse in exchange for improvement on an individual eval.
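
The diversity-loss side of this argument can be illustrated with a toy simulation. The sketch below is not from the original post: it assumes a "model" that is just a categorical distribution over `k` outputs, and a "self-distillation" step that samples from the model and refits it by maximum likelihood on those samples alone. All names and parameter values (`self_distill`, `run`, `k`, `n_samples`) are hypothetical choices for illustration.

```python
import math
import random
from collections import Counter


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def self_distill(probs, n_samples, rng):
    """One toy generation: sample from the model, refit on its own samples.

    Assumption: refitting is plain maximum likelihood (empirical
    frequencies), with no outside data and no smoothing.
    """
    k = len(probs)
    samples = rng.choices(range(k), weights=probs, k=n_samples)
    counts = Counter(samples)
    return [counts[i] / n_samples for i in range(k)]


def run(generations=200, k=50, n_samples=100, seed=0):
    """Start from a uniform model and track entropy across generations."""
    rng = random.Random(seed)
    probs = [1.0 / k] * k
    history = [entropy(probs)]
    for _ in range(generations):
        probs = self_distill(probs, n_samples, rng)
        history.append(entropy(probs))
    return probs, history


if __name__ == "__main__":
    probs, history = run()
    support = sum(p > 0 for p in probs)
    print(f"entropy at start: {history[0]:.3f} bits")
    print(f"entropy at end:   {history[-1]:.3f} bits")
    print(f"outputs still reachable: {support} of {len(probs)}")
```

In this toy setting, an output whose probability reaches zero can never be sampled again, so the set of reachable outputs can only shrink from one generation to the next. That is a simplified analogue of mode collapse; real training pipelines add filtering, reward signals, and optimization dynamics that this sketch does not model.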

→ View original post on X — @alexandr_wang