AI sees things that humans can't
MULTIMODAL AI
-
AI-Powered Glasses Enable Deaf People to Read Conversations
AMAZING: Artificial intelligence-powered glasses let deaf people read conversations. And users can even rewind pictures. #AIForGood #iot #AI #ML #DX #deaf #ArtificialIntelligence
— Sean Gardner (@2morrowknight) December 2, 2022
-
RoentGen: Vision-Language Model for Chest X-rays
A later work, "RoentGen: Vision-Language Foundation Model for Chest X-ray Generation", was also conducted by Stanford's Department of Radiology, in collaboration with Tanishq Abraham (@iScienceLuvr) at @StabilityAI. https://stanfordmimi.github.io/RoentGen/
-
Stanford researchers generate synthetic chest X-rays with Stable Diffusion
Stanford researchers create synthetic yet realistic chest X-rays using #StableDiffusion. SD is typically used for art, not for science. In this case, the “radiographs” might be high quality enough to complement real image datasets. https://arxiv.org/abs/2210.04133
-
RA-CM3 Model Outperforms Baselines with 70% Less Compute
The RA-CM3 model significantly outperforms baseline multimodal models on both image & caption generation tasks, while using <30% of the compute of comparable models & exhibiting novel capabilities like knowledge-intensive image generation & multimodal in-context learning.
-
Meta AI Introduces RA-CM3: Multimodal Model for Text and Image Generation
New paper from our team at Meta AI. Retrieval-Augmented CM3 (RA-CM3) is the first multimodal model that can retrieve and generate mixtures of text and images — while also reducing training cost and model size. Now available on arXiv: https://arxiv.org/abs/2211.12561
-
Language Interfaces: The Future of Human-Computer Interaction
Language interfaces are going to be a big deal, I think. Talk to the computer (voice or text) and get what you want, for increasingly complex definitions of "want"! This is an early demo of what's possible (still a lot of limitations — it's very much a research release).
-
TorchMultimodal: PyTorch Library for Multimodal Multi-Task Models
TorchMultimodal is a @PyTorch library for training state-of-the-art multimodal multi-task models at scale. You can learn more and get started here: https://bit.ly/3XalYlA
-
AI Transforms Child Drawings Into Realistic Images
Bring a child's drawing to life in this demo. By teaching AI to work effectively with this quintessential human form of creativity, we hope this project will move us closer to building AI that can understand the world from a human PoV. https://sketch.metademolab.com
-
Make-A-Video: Text-to-Video Generation Research
Make-A-Video research builds on the progress made in text-to-image generation technology to enable text-to-video generation. You can see examples of the videos here ➡️ https://makeavideo.studio
— AI at Meta (@AIatMeta) November 29, 2022