AI Dynamics

Global AI News Aggregator

Experimenting with Olmo 3 and DeepSeek V3.2 GRPO tweaks

Oh yeah, you probably will. I am currently running tons of experiments to decide which of the Olmo 3 and DeepSeek V3.2 GRPO tweaks I should add in the final version.
No side objective, really, except that I'd do an honest evaluation to make sure the model actually learns well.

→ View original post on X — @rasbt