AI Dynamics

Global AI News Aggregator

Meituan Longcat Introduces Asynchronous RL for LLM Training

Meituan Longcat addresses the rollout bottleneck that long reasoning traces create in asynchronous RL training for LLMs by keeping multiple policy versions alive at once. Long trajectories can stay on their original policy, so training keeps moving without dropping samples or breaking
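The idea of keeping several policy versions alive can be illustrated with a small sketch. This is not Meituan Longcat's implementation, whose details the post does not give. It is a minimal toy, assuming a reference-counting scheme in which each trajectory is pinned to the policy version that was current when its rollout started, and an old version is retired only once no running trajectory still uses it. The names `PolicyVersionPool`, `Trajectory`, `publish`, `start`, and `finish` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    traj_id: int
    policy_version: int  # version pinned when the rollout started (assumption)
    tokens: list = field(default_factory=list)


class PolicyVersionPool:
    """Toy registry: keeps several policy versions alive and retires a
    version only when it is not the latest and no trajectory uses it."""

    def __init__(self):
        self.latest = 0
        self.in_flight = {0: 0}  # version -> number of running trajectories

    def publish(self):
        """Trainer finished an update: expose a new policy version."""
        self.latest += 1
        self.in_flight[self.latest] = 0
        self._retire_idle()
        return self.latest

    def start(self, traj_id):
        """Begin a rollout pinned to the current latest version."""
        self.in_flight[self.latest] += 1
        return Trajectory(traj_id, self.latest)

    def finish(self, traj):
        """A rollout completed; release its hold on its pinned version."""
        self.in_flight[traj.policy_version] -= 1
        self._retire_idle()

    def _retire_idle(self):
        # Drop versions that are neither the latest nor still in use.
        idle = [v for v, n in self.in_flight.items()
                if n == 0 and v != self.latest]
        for v in idle:
            del self.in_flight[v]

    def live_versions(self):
        return sorted(self.in_flight)


pool = PolicyVersionPool()
long_traj = pool.start(traj_id=1)   # pinned to version 0
pool.publish()                      # trainer moves on to version 1
short_traj = pool.start(traj_id=2)  # pinned to version 1
print(pool.live_versions())         # [0, 1]: both versions stay alive
pool.finish(long_traj)
print(pool.live_versions())         # [1]: version 0 retired once unused
```

In this sketch the trainer never waits for the long trajectory and never discards it: the long rollout finishes under version 0 while newer rollouts already use version 1. A real system would also have to hold the model weights for each live version and decide how to weight samples from older policies, which this toy leaves out.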

→ View original post on X (@askalphaxiv)