Of course. It depends on what you optimise for. SpeechT5 TTS (AR) does sound better than vanilla FastSpeech2 (NAR), but worse than VITS/PortaSpeech and significantly worse than TorToise TTS. TorToise, however, is considerably slow.
SpeechT5 vs FastSpeech2 vs VITS: TTS Model Comparison