They also suggested that GPT-4.1 might be the base for the next-generation reasoning model, o4, which is expected to be released soon. GPT-4.1 is significantly stronger than GPT-4o: on the GPQA benchmark, it scores 18 points higher.
@petergostev

Reinforcement Learning Improves with Stronger Base Models Like GPT-4o
Reinforcement learning works better when applied to stronger base models. In a recent post, SemiAnalysis stated that o1 and o3 were trained with GPT-4o as the base, and that their respective 'mini' versions were distilled from the larger models.