“Continuous Latent Diffusion Language Model”: Most diffusion language models still use diffusion to recover token-like states, just in a different generation order. This paper uses diffusion differently: it learns a continuous latent prior for global semantics.
