AI Dynamics

Global AI News Aggregator

Early RLHF Paper: From Supervised Fine-tuning to Personality in Language Models

[Slides] This is one of the earliest papers on RLHF (if not the first, alongside InstructGPT). Before RLHF, language models didn't really have personalities; they relied mostly on supervised fine-tuning or clever prompting to follow what humans wanted. Think back to the InstructGPT days.

→ View original post on X (@jeande_d)
