AI Dynamics

Global AI News Aggregator

RL vs SL: Demystifying Reinforcement Learning’s Core Concept

Yeah exactly. I get triggered when RL is dressed up in its full rigorous math formalism because it's gate-keeping an essentially trivial core idea.
SL: a token sequence comes from some 3rd party source (e.g. human demonstration), and you just train on it.
RL: you first sample a…

→ View original post on X (@karpathy)
