Hertz-codec:
> Convolutional audio VAE
> Encodes 16kHz mono speech to 8Hz latent representation at 1kbps
> 32-dim latent per 125ms frame
> At 1kbps, outperforms SoundStream and EnCodec at 6kbps and is on par with DAC at 8kbps
> 5M encoder, 95M decoder parameters
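
The figures above fit together arithmetically, and a short sketch makes the relationships explicit. The constants come from the summary; the 16-bit PCM baseline used for the compression ratio is an assumption for illustration, and the per-dimension bit budget is only an average implied by the stated numbers, not a description of how the codec actually quantizes its latents.

```python
# Back-of-the-envelope check of the stated hertz-codec figures.
SAMPLE_RATE_HZ = 16_000   # mono speech input (from the summary)
FRAME_RATE_HZ = 8         # latent frames per second (from the summary)
BITRATE_BPS = 1_000       # 1 kbps (from the summary)
LATENT_DIM = 32           # latent dimensions per frame (from the summary)
PCM_BITS_PER_SAMPLE = 16  # ASSUMPTION: 16-bit PCM baseline, not stated in the source

# Duration of audio covered by one latent frame, in milliseconds.
frame_ms = 1000 / FRAME_RATE_HZ

# Number of input samples summarized by one latent frame.
samples_per_frame = SAMPLE_RATE_HZ // FRAME_RATE_HZ

# Average bit budget per frame and per latent dimension.
bits_per_frame = BITRATE_BPS / FRAME_RATE_HZ
bits_per_dim = bits_per_frame / LATENT_DIM

# Compression ratio relative to the assumed uncompressed PCM baseline.
pcm_bps = SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE
compression_ratio = pcm_bps / BITRATE_BPS

print(f"frame duration:      {frame_ms} ms")
print(f"samples per frame:   {samples_per_frame}")
print(f"bits per frame:      {bits_per_frame}")
print(f"bits per latent dim: {bits_per_dim}")
print(f"compression vs PCM:  {compression_ratio}x")
```

Under these assumptions, each 125ms frame stands in for 2000 input samples and carries about 125 bits, or just under 4 bits per latent dimension on average.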
Hertz-codec: Advanced Convolutional Audio VAE Outperforms Competitors