AI Dynamics

Global AI News Aggregator

Policy Gradient for Diffusion Model Fine-tuning

It turns out we can easily apply the material we've covered to get the policy gradient for fine-tuning a diffusion model (see e.g. https://arxiv.org/pdf/2305.13301). It becomes a multi-step RL problem with the reward arriving only at the end. It's not very efficient, I think, but I'd love to
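
To make the "multi-step RL problem with the reward only at the end" concrete, here is a minimal sketch using a plain score-function (REINFORCE) estimator. It is a toy illustration under loud assumptions, not the linked paper's implementation: the "diffusion model" is a 1-D Gaussian reverse process whose only parameter is a scalar shift `theta`, and every name (`sample_trajectory`, `reward`, `policy_gradient_estimate`, `fine_tune`, the target value 2.0) is hypothetical. Each denoising step is treated as an action, the log-probability gradients are summed over the steps, and that sum is weighted by the single reward computed on the final sample.

```python
import random


def sample_trajectory(theta, num_steps, sigma, rng):
    """Run a toy reverse process x_T -> x_0 and accumulate d/dtheta log p(trajectory)."""
    x = rng.gauss(0.0, 1.0)  # x_T ~ N(0, 1); the initial noise does not depend on theta
    grad_log_prob = 0.0
    for _ in range(num_steps):
        mean = x + theta  # assumed toy "denoiser": shift the sample by theta at every step
        x_next = rng.gauss(mean, sigma)
        # For a Gaussian step, d/dtheta log N(x_next; mean, sigma^2)
        # = (x_next - mean) / sigma^2, because d(mean)/d(theta) = 1 here.
        grad_log_prob += (x_next - mean) / sigma ** 2
        x = x_next
    return x, grad_log_prob


def reward(x0, target=2.0):
    """Terminal reward only: how close the final sample x_0 is to a hypothetical target."""
    return -(x0 - target) ** 2


def policy_gradient_estimate(theta, num_steps=5, sigma=0.5, batch_size=256, rng=None):
    """Score-function estimate of d/dtheta E[reward(x_0)] with a mean-reward baseline."""
    rng = rng or random.Random(0)
    samples = [sample_trajectory(theta, num_steps, sigma, rng) for _ in range(batch_size)]
    rewards = [reward(x0) for x0, _ in samples]
    baseline = sum(rewards) / batch_size  # variance reduction; adds a small O(1/batch) bias
    grad = sum((r - baseline) * g for r, (_, g) in zip(rewards, samples)) / batch_size
    return grad, baseline


def fine_tune(theta=0.0, lr=0.01, iterations=200, seed=0):
    """Plain gradient ascent on the expected terminal reward."""
    rng = random.Random(seed)
    for _ in range(iterations):
        grad, _ = policy_gradient_estimate(theta, rng=rng)
        theta += lr * grad
    return theta
```

In this toy setup the expected reward is maximized at `theta = target / num_steps = 0.4`, so `fine_tune()` should drift toward that value, up to sampling noise. The noise is the point: one scalar reward has to be credited across every denoising step, which is one reason this kind of estimator tends to need many samples.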

→ View original post on X — @nandodf