AI Dynamics

Global AI News Aggregator

Multimodal models show no obvious transfer learning from images back to text.

Also weird there’s no obvious transfer learning back to text. For all the ineffable, AGI-essential knowledge supposedly in images, multimodal models seem no better at spatial reasoning word problems, creating SVGs, designing web UI, or drawing ASCII art.

→ View original post on X — @goodside