AI Dynamics

Global AI News Aggregator

RL Finetuning: Major Upgrade Over SFT for LLMs

I just mean long term, imo RL finetuning paradigm is a big upgrade over just SFT (expert imitation) for LLMs at the current stage of development and will continue to grow substantially.

→ View original post on X (@karpathy)
