there’s a palpable tension in the air as hundreds of AI researchers (including me!) quietly work nights and weekends trying to figure out the “right way” to scale RL

math & code are not the universe

we will not rest until post-training is as clean and elegant as pre-training
AI Researchers Race to Perfect Reinforcement Learning Scaling