i'll take blogposts as long as they come with open artefacts on the hub
@reach_vb
-
Anticipating a Whisper Speech Recognition Model Refresh
i was hoping for another whisper refresh actually
-
Smaller Multilingual Models Under 1B Parameters
go even smaller than 2B & multilingual – sub 1B is perfect size!
-
Autoregressive Latent Diffusion Model for Video Generation
> autoregressive latent diffusion model
> trained on large video datasets
> latent frames pass through an autoencoder to a transformer dynamics model
> uses a causal mask similar to LLMs
> inference involves frame-by-frame autoregressive sampling with past frames
— Vaibhav (VB) Srivastav (@reach_vb) December 4, 2024
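The steps quoted above describe the inference loop at a high level: encode frames to latents, run a causally masked dynamics model over the past latents, and sample the next frame conditioned only on what came before. Below is a toy numpy sketch of that loop under stated assumptions — the `dynamics_model` here is a hypothetical stand-in (a decayed average plus noise), not the actual transformer, and the context length and noise scale are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8   # size of one latent frame (toy value)
CONTEXT = 4      # how many past latent frames the dynamics model sees

def dynamics_model(past_latents):
    """Stand-in for the transformer dynamics model.

    A real model would attend over past_latents under a causal mask;
    here we just predict the next latent as a recency-weighted average."""
    weights = np.arange(1, len(past_latents) + 1, dtype=float)
    weights /= weights.sum()
    return (np.stack(past_latents) * weights[:, None]).sum(axis=0)

def sample_video(first_latent, n_frames):
    """Frame-by-frame autoregressive sampling: each new latent frame is
    conditioned only on previously generated frames, LLM-style."""
    latents = [first_latent]
    for _ in range(n_frames - 1):
        context = latents[-CONTEXT:]                    # sliding causal context
        nxt = dynamics_model(context)
        nxt = nxt + 0.01 * rng.standard_normal(LATENT_DIM)  # stochastic sampling step
        latents.append(nxt)
    return np.stack(latents)

latents = sample_video(rng.standard_normal(LATENT_DIM), n_frames=6)
print(latents.shape)  # (6, 8): one latent vector per generated frame
```

In the real model each latent would then go back through the autoencoder's decoder to produce pixels; the sketch stops at the latent sequence.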
-
DeepMind Genie 2: Multimodal World Model Generates 3D Environments
DeepMind COOKED! Genie 2, a large-scale, multi-modal foundation world model! 🔥
Capable of creating endless action-controllable, playable 3D environments – the future is going to be so, so wild!
— Vaibhav (VB) Srivastav (@reach_vb) December 4, 2024
-
Model Distribution Evolution: Stable Diffusion to Llama 3.1
Wild how the distribution of models changes so, soo much over the two years!
We went from Stable Diffusion v1.4 to Mixtral to Llama 3.1 8B 🔥
— Vaibhav (VB) Srivastav (@reach_vb) December 4, 2024
-
Indic-Parler TTS: Speech Synthesis Model for 20 Indian Languages
Introducing Indic-Parler TTS – Trained on 10K hours of data, 938M params, supports 20 Indic languages, emotional synthesis, apache 2.0 licensed! 🔥
A collaboration w/ @ai4bharat & @huggingface – w/ fully customisable speech and voice personas!
Try it out directly below or use…
— Vaibhav (VB) Srivastav (@reach_vb) December 3, 2024