Running inference on an assortment of FastSpeech2 variants with a new vocoder… after 3 hours of sifting through the stack trace, I realised I forgot to `unsqueeze` my input vector! *cries in torch*
@reach_vb
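For context, the missing step looks something like this: most PyTorch models expect a leading batch dimension, so a 1-D input vector has to be `unsqueeze`d before inference. A minimal sketch (the IDs and shapes here are made up for illustration, not from any real FastSpeech2 checkpoint):

```python
import torch

# A single utterance as a 1-D vector of (made-up) phoneme IDs.
tokens = torch.tensor([3, 14, 15, 9, 2])  # shape: (5,)

# Models that expect batched input need a leading batch dimension.
batched = tokens.unsqueeze(0)             # shape: (1, 5)

print(tokens.shape, batched.shape)
```

Passing the unbatched `(5,)` tensor into a model that indexes a batch dimension typically fails several layers deep, which is why the resulting stack trace can take a while to decode.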
-
Coqui AI Open Source Text-to-Speech Solution Gains Recognition
Brilliant! Open source Text-to-Speech ftw! Kudos @coqui_ai
-
Sharing AI Models on Hub Platform
Fantastic!! Would be cool to get the models on the hub too! Happy to help with it, if you want!
-
Improving Whisper Transcription Quality with Contrastive Search Strategy
In my experience those results were quite suboptimal and didn’t quite result in usable transcriptions. With a better decoding strategy those issues can be alleviated a bit. So the hack is more about using contrastive search with Whisper to enable such use cases. Will run more
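For anyone curious, the core idea of contrastive search can be sketched in a few lines: each candidate token is scored by model confidence minus a degeneration penalty, i.e. its maximum cosine similarity to what has already been generated. This is a hand-rolled illustration with made-up vectors, not Whisper's actual decoding code:

```python
import numpy as np

def contrastive_score(prob, cand_vec, context_vecs, alpha=0.6):
    """Score a candidate token as (1 - alpha) * model confidence
    minus alpha * max cosine similarity to the context (degeneration penalty)."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    penalty = max(cos(cand_vec, c) for c in context_vecs)
    return (1 - alpha) * prob - alpha * penalty

# Context: hidden states of the tokens generated so far (made-up numbers).
context = [np.array([1.0, 0.0]), np.array([0.8, 0.6])]

# A repetitive candidate (identical to a context vector) vs a novel one.
repetitive = contrastive_score(0.9, np.array([1.0, 0.0]), context)
novel = contrastive_score(0.5, np.array([0.0, 1.0]), context)

print(repetitive, novel)
```

Despite its higher probability, the repetitive candidate scores lower, which is exactly the mechanism that discourages the degenerate loops greedy decoding can fall into.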
-
Fine-Tuning Whisper Model for Improved Performance
Definitely yes! You can fine-tune Whisper to boost the model performance: https://huggingface.co/blog/fine-tune-whisper
-
Contrastive Search Outperforms Greedy Search in Text Generation
Interesting! Can you try running the next cell with contrastive search? In my experience greedy search doesn’t perform as well! Would be curious to see the results 🙂
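The kind of failure greedy search exhibits can be simulated with a toy next-token table: following the argmax at every step gets stuck in a loop, while a crude repetition penalty (standing in for contrastive search's degeneration penalty) breaks out of it. Purely illustrative, not a real language model:

```python
# Toy next-token model: token -> list of (next token, probability).
NEXT = {
    "the": [("cat", 0.6), ("dog", 0.4)],
    "cat": [("the", 0.7), ("sat", 0.3)],
    "dog": [("ran", 0.9), ("the", 0.1)],
    "sat": [("down", 1.0)],
    "ran": [("home", 1.0)],
    "down": [], "home": [],
}

def greedy(start, steps=6):
    # Always follow the argmax: "the" -> "cat" -> "the" -> "cat" ... forever.
    out, tok = [start], start
    for _ in range(steps):
        cands = NEXT[tok]
        if not cands:
            break
        tok = max(cands, key=lambda c: c[1])[0]
        out.append(tok)
    return out

def penalised(start, steps=6):
    # Halve a candidate's score for each prior occurrence in the output:
    # a crude stand-in for contrastive search's degeneration penalty.
    out, tok = [start], start
    for _ in range(steps):
        cands = NEXT[tok]
        if not cands:
            break
        tok = max(cands, key=lambda c: c[1] * 0.5 ** out.count(c[0]))[0]
        out.append(tok)
    return out

print(greedy("the"))     # loops between "the" and "cat"
print(penalised("the"))  # escapes the loop and reaches an end token
```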
-
AI Transcription Accuracy and Generation Strategy Optimization
Yess! Quite possible – although the accuracy of the transcriptions still needs to be checked. I’ve found it to lose context if not used with the right generation strategy!
-
Contrastive Search Benchmarks Coming Soon for Developers
Definitely! More benchmarks on contrastive search coming soon too. Thanks for empowering millions of developers!
-
Contrastive Search vs Greedy Search for LLM Generation
Yeah! Would definitely recommend contrastive search. It works quite well (however, it over-generates sometimes). Greedy search simply results in lost context.
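In the Hugging Face transformers API, the switch between the two strategies is just a pair of `generate()` kwargs: `penalty_alpha` together with `top_k` enables contrastive search, while `num_beams=1` with `do_sample=False` (the default) is greedy search. A sketch, assuming a causal LM `model` and tokenised `input_ids` are already loaded:

```python
# Greedy search (transformers' default): follow the argmax at each step.
greedy_kwargs = dict(do_sample=False, num_beams=1, max_new_tokens=64)

# Contrastive search: penalty_alpha weighs the degeneration penalty,
# top_k bounds the candidate set considered at each step.
contrastive_kwargs = dict(penalty_alpha=0.6, top_k=4, max_new_tokens=64)

# Usage (hypothetical names; `model` and `input_ids` are assumed to exist):
# output_ids = model.generate(input_ids, **contrastive_kwargs)
```

The `penalty_alpha=0.6, top_k=4` pairing is the commonly cited starting point; over-generation can be reined in with `max_new_tokens` or an explicit stopping criterion.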
-
Fine-tuned Models Performance on Multilingual Translation Tasks
Agreed! From my experience, it works well on languages that were abundant in the train set and also had lang -> en translation pairs. I’ll run some experiments to see how well fine-tuned models perform. Quite lovely to see ya already using this!