Wow! New Speech to Speech model – Fish Agent v0.1 3B by @FishAudio 🔥
– Vaibhav (VB) Srivastav (@reach_vb) November 5, 2024
> Trained on 700K hours of multilingual audio
> Continue-pretrained version of Qwen-2.5-3B-Instruct for 200B audio & text tokens
> Zero-shot voice cloning
> Text + audio input/ Audio output
> Ultra-fast… pic.twitter.com/UvdwxGUm4w