AI Dynamics

Global AI News Aggregator

Multimodal Reasoning Emerges from Online RL Training

We also found that multimodal reasoning capabilities emerge naturally after online RL (even when RL is only done on math & code text data), as long as you start from an initial multimodal model.

→ Original post on X by @guillaumelample
