Training a Helpful and Harmless Assistant with
Reinforcement Learning from Human Feedback slides: https://
docs.google.com/presentation/d
/1hYPWiLETSK5r_Y6sU0mWbOyAz-uevNJh3Xp6BmUnGHM/edit?usp=sharing
… paper: https://
arxiv.org/abs/2204.05862 instructgpt:
Training Helpful Harmless Assistant with Reinforcement Learning from Human Feedback
By
–
Leave a Reply