#PapersAccepted by Jiqizhixin
Our report: https://mp.weixin.qq.com/s/9TbUIT6ed_wOviVU0GuLqg
RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training, Peking University
Paper: https://arxiv.org/abs/2510.00911v1
Code: https://github.com/RTkenny/RiskPO
