"Image Generators are Generalist Vision Learners": This paper argues that image-generation training acts like vision pretraining. The authors use light instruction tuning to turn a pretrained image generator into a generalist vision model, with the key idea being to represent vision
Image Generators as Vision Foundation Models Through Instruction Tuning
