Interesting new paper on online RL for agents. Most agent training still treats deployment and learning as separate phases: serve the model first, collect data later, fine-tune offline. But every agent interaction already contains a learning signal. This paper introduces
Online Reinforcement Learning for Agents: New Research Approach
