AI Dynamics

Global AI News Aggregator

MULTIMODAL AI

DeepFloyd AI’s Text-to-Image Breakthrough Will Transform Graphic Design

By @swyx – 15 January 2023 14h19

it is obvious that the @deepfloydai team have made an earth-shattering breakthrough in text-to-image-with-text that will melt the face of visual graphics designers. do not sleep on this. it is going to be as big a leap as going from the shitty GAN era to @StableDiffusion itself.

→ View original post on X — @swyx,

15 January 2023

Generating Kanji Images from English Text

By AI Dynamics – 14 January 2023 17h42

A related experiment, generating “Kanji” images from English text:

→ View original post on X — @hardmaru,

14 January 2023

Effective Concepts for Text-to-Image Models and Latent Space Exploration

By AI Dynamics – 14 January 2023 10h53

“Tardigrade” works really well on most text-to-image models. There are so many images of them. Other concepts that work well involve insects, animals, plants. Food (pizza, noodles). Historical events (WW2). Medical (X-rays). The latent space is full of exploration opportunities.

→ View original post on X — @hardmaru,

14 January 2023

Neural Networks Enhance Character Learning with Visual Mnemonics

By AI Dynamics – 14 January 2023 9h58

Check out this 2021 experiment by @azlenelza where characters were augmented with visual mnemonics with the help of neural networks, to make them easier to learn:

→ View original post on X — @hardmaru,

14 January 2023

VALL-E Text-to-Speech Synthesis: Audio and Video Generation

By @saboo_shubham_ – 14 January 2023 7h28

How about generating audio and video? Something that we saw with VALL-E (text-to-speech synthesis): https://valle-demo.github.io

→ View original post on X — @saboo_shubham_,

14 January 2023

Multimodal AI: The Magic Behind Tomorrow’s Technology

By @saboo_shubham_ – 14 January 2023 7h11

Sounds like magic right? But that's what multimodal AI will enable!!

→ View original post on X — @saboo_shubham_,

14 January 2023

GPT-4 Multimodality: What Does It Really Mean?

By @saboo_shubham_ – 14 January 2023 7h04

GPT-4 will be multimodal. There is a very high chance that multimodality will lead the next generation of AI models. But what does that really mean?

→ View original post on X — @saboo_shubham_,

14 January 2023

Stable Diffusion Generates Music from Image Training Data

By @swyx – 13 January 2023 17h11

It turns out that music-to-images also counts as a valid transformation – you can train the raw, unmodified @StableDiffusion on IMAGES OF MUSIC (and generate music better than AI specifically made to generate music!) https://t.co/bWm4dfezYn

→ View original post on X — @swyx,

13 January 2023

Recognition Systems, Computer Vision, ImageNet: AI Glossary

By @rschmelzer – 13 January 2023 16h01

In this @Cognilytica #AIToday #podcast AI Glossary Series episode 'Recognition Systems, Computer Vision, ImageNet', hosts @rschmelzer & @kath0134 define these terms, discuss how they're related, & why it’s important to understand them. Full episode: https://cognilytica.com/2023/01/11/ai-today-podcast-ai-glossary-series-recognition-systems-computer-vision-imagenet/?utm_source=dlvr.it&utm_medium=twitter #CV #ML

→ View original post on X — @rschmelzer,

13 January 2023

Abacus AI Launches Foundation Models for Language, Vision, and Speech

By AI Dynamics – 12 January 2023 23h00

You can now integrate generative models into your system using Abacus AI! We have three foundational models: language, vision, and speech. You can fine-tune them and start using them right away! https://abacus.ai/foundation_models

→ View original post on X — @abacusai,

12 January 2023