RL post-training is hitting a rollout bottleneck. This new paper from #NVIDIAResearch shows how speculative decoding in NeMo-RL + @vllm_project can accelerate rollouts losslessly, with 1.8x higher rollout throughput at 8B and a projected 2.5x end-to-end speedup at 235B. Read the full
Speculative Decoding Accelerates RL Rollouts 2.5x in NeMo-RL
