Clone any voice with just a 3-second audio clip! Mistral just open-sourced Voxtral TTS, a text-to-speech model that clones voices from 3 seconds of audio and runs on edge devices. Here's what makes it different: most TTS models need cloud GPUs and long audio samples. Voxtral
Mistral Open Sources Voxtral TTS Voice Cloning Model
