AI Dynamics

Global AI News Aggregator

Reinforcement Learning Improves with Stronger Base Models Like GPT-4o

Reinforcement learning tends to work better when applied to a stronger base model. In a recent post, SemiAnalysis stated that o1 and o3 were both trained with GPT-4o as the base model, and that the respective 'mini' versions were distilled from their larger counterparts.

→ View original post on X — @petergostev
