it is obvious that the @deepfloydai team have made an earth-shattering breakthrough in text-to-image-with-text that will melt the face of visual graphics designers. do not sleep on this. it is going to be as big a leap as going from the shitty GAN era to @StableDiffusion itself.
MULTIMODAL AI
-
Generating Kanji Images from English Text
A related experiment, generating “Kanji” images from English text:
-
Effective Concepts for Text-to-Image Models and Latent Space Exploration
“Tardigrade” works really well on most text-to-image models. There are so many images of them. Other concepts that work well involve insects, animals, plants. Food (pizza, noodles). Historical events (WW2). Medical (X-rays). The latent space is full of exploration opportunities.
-
Neural Networks Enhance Character Learning with Visual Mnemonics
Check out this 2021 experiment by @azlenelza where characters were augmented with visual mnemonics with the help of neural networks, to make them easier to learn:
-
VALL-E Text-to-Speech Synthesis: Audio and Video Generation
How about generating audio and video? We saw something like this with VALL-E (text-to-speech synthesis): https://valle-demo.github.io
-
Multimodal AI: The Magic Behind Tomorrow’s Technology
Sounds like magic, right? But that's what multimodal AI will enable!
-
GPT-4 Multimodality: What Does It Really Mean?
GPT-4 will be multimodal. There is a very high chance that multimodality will lead the next generation of AI models. But what does that really mean?
-
Stable Diffusion Generates Music from Image Training Data
It turns out that music-to-images also counts as a valid transformation: you can train the raw, unmodified @StableDiffusion on IMAGES OF MUSIC (and generate music better than AI specifically made to generate music!) https://t.co/bWm4dfezYn
swyx 🐣 (@swyx), January 13, 2023
-
Recognition Systems, Computer Vision, ImageNet: AI Glossary
In this @Cognilytica #AIToday #podcast AI Glossary Series episode 'Recognition Systems, Computer Vision, ImageNet', hosts @rschmelzer & @kath0134 define these terms, discuss how they're related, & why it's important to understand them. Full episode: https://cognilytica.com/2023/01/11/ai-today-podcast-ai-glossary-series-recognition-systems-computer-vision-imagenet/?utm_source=dlvr.it&utm_medium=twitter
…
#CV #ML
-
Abacus AI Launches Foundation Models for Language, Vision, and Speech
You can now integrate generative models into your system using Abacus AI! We have foundation models for language, vision, and speech. You can fine-tune them and start using them right away! https://abacus.ai/foundation_models
…