Tencent released SRPO ("Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference") on Hugging Face. By fine-tuning the FLUX.1-dev model with optimized denoising and online reward adjustment, SRPO improves its human-evaluated realism and aesthetic quality by over 3x.
Tencent SRPO: Aligning Diffusion Models with Human Preferences