AI Dynamics

Global AI News Aggregator

DAgger vs. RL: Feedback Methods for LLM Training

DAgger is a great way to bring feedback into LLM training (e.g., for chat), but it is not a replacement for RL, because RL makes room for other forms of feedback (e.g., preferences). As a teacher, though, I would advise carefully measuring how much each method contributes to the final metric.
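One way to make that measurement concrete is a 2x2 ablation: evaluate the same metric with neither method, with each alone, and with both, then split the total gain into per-method gains and an interaction term. The sketch below is only an illustration under that assumption, not something from the original post. The `AblationResult` fields, the `contributions` helper, and the numbers in the usage lines are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class AblationResult:
    """Final-metric values from four training runs (hypothetical setup)."""
    baseline: float     # neither DAgger nor RL
    dagger_only: float  # DAgger-style feedback only
    rl_only: float      # RL only
    both: float         # DAgger and RL combined


def contributions(r: AblationResult) -> dict:
    """Split the total gain into per-method gains and an interaction term."""
    dagger_gain = r.dagger_only - r.baseline
    rl_gain = r.rl_only - r.baseline
    # Interaction: what the combined run adds (or loses) beyond the sum
    # of the two individual gains.
    interaction = r.both - r.dagger_only - r.rl_only + r.baseline
    return {
        "dagger_gain": dagger_gain,
        "rl_gain": rl_gain,
        "interaction": interaction,
        "total_gain": r.both - r.baseline,
    }


# Placeholder numbers for illustration only; they are not real results.
example = AblationResult(baseline=0.50, dagger_only=0.60, rl_only=0.65, both=0.80)
for name, value in contributions(example).items():
    print(f"{name}: {value:.3f}")
```

A positive interaction term would suggest the two methods complement each other, while a negative one would suggest they overlap. In practice each cell should be averaged over several seeds before reading much into the differences.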

→ View original post on X: @nandodf
