AI Dynamics

Global AI News Aggregator

Generalization of SFT: Reinforcement Learning with Reward Rectification

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

→ View original post on X — @_akhaliq