EMOVA
— AK (@_akhaliq) 27 septembre 2024
Empowering Language Models to See, Hear and Speak with Vivid Emotions
discuss: https://t.co/QifoJVP166
GPT-4o, an omni-modal model that enables vocal conversations with diverse emotions and tones, marks a milestone for omni-modal foundation models. However, empowering… pic.twitter.com/UBIT587Nml
EMOVA Empowering Language Models to See, Hear and Speak with Vivid Emotions discuss: https://
huggingface.co/papers/2409.18
042
… GPT-4o, an omni-modal model that enables vocal conversations with diverse emotions and tones, marks a milestone for omni-modal foundation models. However, empowering
