The text and image encoders within Embed 3 share a unified latent space, significantly reducing complexity for users. Our model is also uniquely performant at mixed modality searches.
Embed 3 Unifies Text Image Encoders for Mixed Modality Search
By
–
