AI Dynamics

Global AI News Aggregator

Meta-RL with Rubric-guided Policy Decomposition Research

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

→ View original post on X — @_akhaliq