What if video generation could follow any semantic instruction without retraining or task-specific hacks? Enter Video-As-Prompt (VAP). By treating a reference video as an in-context semantic prompt and steering a frozen DiT with a plug-and-play MoT expert plus temporally
Video-As-Prompt: Semantic Video Generation Without Retraining