Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
We propose Make-An-Audio 2, a latent diffusion-based text-to-audio (T2A) method that builds on the success of Make-An-Audio. Our approach includes several techniques to improve semantic alignment and temporal consistency. Firstly, we use