A deep learning paper from our team on how to build multimodal models, where data for individual modalities arrives at different speeds, by combining powerful pretrained text, image and video models. Useful for robotics. Congrats @konradzolna @serkancabi @yutianc Eric, Claudio,… https://t.co/vPmZpDkve3
— Nando de Freitas (@NandoDF) January 17, 2024