AI Dynamics

Global AI News Aggregator

Hertz-VAE: 1.8B Parameter Decoder-Only Transformer Architecture

Hertz-vae:
> 1.8B parameters, 8-layer decoder-only transformer
> First four layers receive the latent history
> Layer 5 receives the ground-truth 15-bit quantized representation during training
> At inference, it directly samples hertz-lm's next-token prediction
> Near-perfect at
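The conditioning scheme above can be sketched as a toy forward pass. This is a minimal structural illustration, not the actual Hertz-VAE implementation: the layer widths, the `embed_token` helper, and the additive conditioning are all assumptions made for readability, and the "decoder layer" here is a stand-in linear map rather than real attention.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 64           # toy hidden size (the real model has 1.8B parameters)
N_LAYERS = 8     # 8-layer decoder-only stack, per the post
QUANT_BITS = 15  # width of the quantized representation fed to layer 5

# Stand-in for a transformer decoder layer: one random linear map + nonlinearity.
layer_weights = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_LAYERS)]

def decoder_layer(i, h):
    return np.tanh(h @ layer_weights[i])

def embed_token(token):
    """Hypothetical embedding of a 15-bit quantized token into the hidden space."""
    vec = np.zeros(D)
    vec[:QUANT_BITS] = [(token >> b) & 1 for b in range(QUANT_BITS)]
    return vec

def forward(latent_history, quantized_token):
    """One pass mirroring the described conditioning: layers 1-4 see the latent
    history; layer 5 sees the quantized token (ground truth during training,
    a sample from hertz-lm at inference)."""
    h = np.zeros(D)
    for i in range(N_LAYERS):
        if i < 4:       # first four layers receive the latent history
            h = h + latent_history
        if i == 4:      # layer 5 receives the quantized representation
            h = h + embed_token(quantized_token)
        h = decoder_layer(i, h)
    return h

# Training step: condition layer 5 on the ground-truth quantized token.
out_train = forward(rng.standard_normal(D), quantized_token=0b101010101010101)

# Inference step: condition on a token sampled from hertz-lm (random stand-in here).
sampled = int(rng.integers(0, 2 ** QUANT_BITS))
out_infer = forward(rng.standard_normal(D), quantized_token=sampled)
print(out_train.shape, out_infer.shape)
```

The only difference between the two calls is where the layer-5 token comes from, which is the teacher-forcing vs. sampling distinction the post describes.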

→ View the original post on X: @reach_vb
