AI Dynamics

Global AI News Aggregator

Vid2Seq: Pretrained Visual Language Model for Video Description

(11/12) Vid2Seq: A Pretrained Visual Language Model for Describing Multi-Event Videos
Authors: Antoine Yang, @NagraniArsha, Paul Hongsuck Seo, @antoine77340, @jponttuset, Ivan Laptev, Josef Sivic, @CordeliaSchmid

→ View original post on X — @cohere