AI Dynamics

Global AI News Aggregator

Weak-to-Strong Generalization: Beyond RLHF for Superalignment

Naive weak supervision isn't enough: current techniques, like RLHF, won't be sufficient for future superhuman models. But we also show that it's feasible to drastically improve weak-to-strong generalization, making iterative empirical progress on a core challenge of superalignment.

→ View original post on X (@openai)
