AI Dynamics

Global AI News Aggregator

o3 Performance Shows Rapid Progress in RL Chain-of-Thought Scaling

o3 is very performant. More importantly, the progress from o1 to o3 took only three months, which shows how fast progress will be in the new paradigm of RL on chain of thought to scale inference compute. That is far faster than the pretraining paradigm of a new model every 1-2 years.

→ View original post on X: @_jasonwei
