AI Dynamics

Global AI News Aggregator

SOTA Multimodal Models Without ViT Vision Encoders

I have been wondering lately whether there is any SOTA multimodal model that does not use a ViT as its vision encoder.

→ View original post on X — @jeande_d