3/ ImageBind – an approach that learns a single joint embedding space across six modalities at once; it extends zero-shot capabilities to new modalities and enables emergent applications such as cross-modal retrieval and composing modalities.
ImageBind: Joint Embedding Across Six Modalities
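To make "cross-modal retrieval" and "composing modalities" concrete, here is a minimal sketch of what a shared embedding space allows. It does not use ImageBind itself: the 3-dimensional vectors, the item names, and the helpers `normalize`, `cosine`, `retrieve`, and `compose` are all hypothetical and made up for illustration. In practice the vectors would come from the model's per-modality encoders, and summing normalized embeddings is only one simple way to combine two queries.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def retrieve(query, gallery):
    """Return gallery keys ranked by cosine similarity to the query, best first."""
    return sorted(gallery, key=lambda k: cosine(query, gallery[k]), reverse=True)

def compose(a, b):
    """Combine two queries by summing their unit-length embeddings (an assumption)."""
    return [x + y for x, y in zip(normalize(a), normalize(b))]

# Toy, hand-made embeddings standing in for encoder outputs (hypothetical values).
image_gallery = {
    "dog_photo": [0.9, 0.1, 0.0],
    "beach_photo": [0.0, 0.2, 0.9],
    "dog_on_beach_photo": [0.6, 0.1, 0.6],
}
audio_bark = [1.0, 0.0, 0.1]   # pretend audio embedding of a dog barking
text_beach = [0.0, 0.1, 1.0]   # pretend text embedding of "a beach"

# Cross-modal retrieval: an audio query ranks images.
ranked_for_bark = retrieve(audio_bark, image_gallery)

# Composing modalities: audio + text query ranks images.
ranked_for_combo = retrieve(compose(audio_bark, text_beach), image_gallery)

print(ranked_for_bark[0])
print(ranked_for_combo[0])
```

Because every modality lands in the same space, the audio query can rank images directly, and the combined audio-plus-text query favors the image that matches both.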