"Gemini Embedding 2" This paper turns Gemini into one native embedding model for text, image, video, audio, and interleaved multimodal inputs. Instead of converting everything into text first, it embeds raw modalities directly into one shared space, improving audio search,
Gemini Embedding 2: Native Multimodal Embedding Model
