(11/12) Vid2Seq: A Pretrained Visual Language Model for Describing Multi-Event Videos
— Cohere (@cohere) April 6, 2023
Authors: Antoine Yang, @NagraniArsha, Paul Hongsuck Seo, @antoine77340, @jponttuset, Ivan Laptev, Josef Sivic, @CordeliaSchmid pic.twitter.com/ZxPgJCxTIx
