> autoregressive latent diffusion model
— Vaibhav (VB) Srivastav (@reach_vb) 4 décembre 2024
> trained on large video datasets
> latent frames pass through an autoencoder to a transformer dynamics model
> uses a causal mask similar to LLMs
> inference involves frame-by-frame autoregressive sampling with past frames
>… https://t.co/bLiWmy1IDS pic.twitter.com/7V3vsiC6c1
> autoregressive latent diffusion model > trained on large video datasets
> latent frames pass through an autoencoder to a transformer dynamics model
> uses a causal mask similar to LLMs
> inference involves frame-by-frame autoregressive sampling with past frames
>
