AI Dynamics

Global AI News Aggregator

System Card Analysis: Reasoning Implementation via RL Toggle

What a week! Just read the system card, and it looks like they implemented reasoning via RL. My guess is that the thinking on/off toggle is a system prompt. I wonder if they added inference-time scaling like o1 or if it’s just RL like R1. Has anyone found details on that?

→ View original post on X, by @rasbt
