AI Dynamics

Global AI News Aggregator

SWEET-RL: Novel Algorithm for Long-Horizon Multi-Turn Tasks

As part of this work, we’re also releasing SWEET-RL, a novel RL algorithm for long-horizon, multi-turn tasks that can perform better credit assignment. Our experiments demonstrate that SWEET-RL achieves a 6% absolute improvement in success and win rates on

→ View original post on X — @aiatmeta