I'm releasing 53 slides on post-training, covering core algorithms like DPO and GRPO, as well as data quality, synthetic data pipelines, and on-policy training. I had the pleasure of presenting them yesterday as a guest lecturer in Cambridge, UK.
→ View original post on X — @maximelabonne, 2026-03-06 10:45 UTC
