AI Dynamics

Global AI News Aggregator

9B Parameter State Space Model Rivals Attention Transformers

A 9-billion-parameter State Space Model (SSM), an alternative to attention, has been released. Recurrent transformers are now on par with attention transformers such as Gemma and Mistral, and because they maintain a state vector instead of attending over every previous token, they can offer faster inference.
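To illustrate why carrying a state vector can make inference cheaper, here is a minimal sketch of a toy diagonal linear SSM. It is not the architecture of the released model; the function names (`ssm_step`, `run_ssm`) and the parameters `a`, `b`, `c` are hypothetical and chosen only for illustration. The point is that each new token updates a state of fixed size, so the per-token cost does not grow with the length of the sequence already processed.

```python
from typing import List, Tuple


def ssm_step(
    state: List[float], x: float, a: List[float], b: List[float], c: List[float]
) -> Tuple[List[float], float]:
    """One recurrent update of a toy diagonal linear SSM (illustrative only).

    h_t = a * h_{t-1} + b * x_t   (elementwise)
    y_t = sum(c * h_t)
    """
    new_state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
    y = sum(c_i * h_i for c_i, h_i in zip(c, new_state))
    return new_state, y


def run_ssm(
    xs: List[float], a: List[float], b: List[float], c: List[float]
) -> Tuple[List[float], List[float]]:
    """Process a sequence token by token, keeping only the current state."""
    state = [0.0] * len(a)  # fixed-size state, independent of len(xs)
    ys = []
    for x in xs:
        state, y = ssm_step(state, x, a, b, c)
        ys.append(y)
    return ys, state


# Hypothetical toy parameters: a two-dimensional state.
a = [0.5, 0.25]   # per-dimension decay
b = [1.0, 1.0]    # input projection
c = [1.0, 2.0]    # output projection

ys, final_state = run_ssm([1.0, 0.0, 0.0], a, b, c)
print(ys, final_state)
```

However long the input is, `final_state` here always has two entries. An attention layer, by contrast, has to keep and revisit keys and values for all earlier tokens, so its per-token work and memory grow with context length.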

→ View original post on X — @nandodf