AI Dynamics

Global AI News Aggregator

DPO vs RLHF: Genuine Competition or Surface-Level Improvement?

Yeah, exactly. I wonder whether DPO genuinely competes with RLHF, or whether the models only look good on the surface but turn out worse under closer inspection (as with imitation models).

→ View original post on X: @rasbt
