I thought it was a hybrid: a game engine renders the camera-controlled footage at low res, and a diffusion model upscales it (and produces the weird stuff).
Camera rotations and translations are so notoriously hard in video gen that I thought some runtime trick would be a great shortcut.
Game Engine and Diffusion Model Hybrid for Video Generation