AI Dynamics

Global AI News Aggregator

Microsoft MAI Guide: 1T Model, 35B Active, No Synthetic Data

Fantastic in-depth guide about Microsoft MAI by @eliebakouch. Respect where respect is due. tl;dr about the model:
- Zero synthetic data or distillation from previous models.
- 1T-parameter model with 35B active parameters, trained on 33.5T tokens.

→ View original post on X — @kimmonismus