ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
ProRL: Reinforcement Learning for Proactive Recommendations
By
–
