They are releasing a combined text-audio-vision model that processes all three modalities in one single neural network, which can then do real-time voice translation as a special case afterthought, if you ask it to.
— Andrej Karpathy (@karpathy) May 13, 2024
(fixed it for you) https://t.co/0y36OId88h