AI Dynamics

Global AI News Aggregator

About

Multiple RL Trajectories Improve Agent Strategy Diversity

We have always been using 1 RL trajectory for training. Why not use more? In this research: Polychromic Objectives for RL, they train their model on sets of trajectories and reward both success & diversity, so the agent keeps MULTIPLE strategies alive all at once.

→ View original post on X — @askalphaxiv