Hybrid Model with Multiplicative Gates and Short Convolutions

The result is a hybrid model with multiplicative gates and short convolutions:
– 10 double-gated short-range LIV convolution blocks
– 6 grouped query attention (GQA) blocks

It's REALLY fast, especially on CPU!
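
The post does not show what happens inside a block, so the following is only a rough sketch of one way a double-gated short convolution could be wired, under stated assumptions: an input projection yields two gates (`b`, `c`) and a value stream (`v`), the first gate multiplies the value before a causal depthwise convolution with a few taps, and the second gate multiplies the result before an output projection. All names (`double_gated_short_conv`, `W_in`, `W_out`, `kernels`) and sizes are hypothetical, and the pure-Python loops are for readability, not speed.

```python
import random

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def causal_depthwise_conv(seq, kernels):
    """Short causal convolution applied per channel.

    seq: list of T vectors of length D.
    kernels: D lists of K taps (one small filter per channel).
    The output at step t only depends on steps t-K+1 .. t.
    """
    T, D, K = len(seq), len(seq[0]), len(kernels[0])
    out = []
    for t in range(T):
        row = []
        for d in range(D):
            acc = 0.0
            for k in range(K):
                s = t - k
                if s >= 0:  # implicit zero padding on the left
                    acc += kernels[d][k] * seq[s][d]
            row.append(acc)
        out.append(row)
    return out

def double_gated_short_conv(seq, W_in, kernels, W_out):
    """Assumed layout: project, gate, short conv, gate again, project."""
    D = len(seq[0])
    gated, out_gates = [], []
    for x in seq:
        h = matvec(W_in, x)  # length 3*D: two gates plus the value stream
        b, c, v = h[:D], h[D:2 * D], h[2 * D:]
        gated.append([bi * vi for bi, vi in zip(b, v)])  # first multiplicative gate
        out_gates.append(c)
    conv = causal_depthwise_conv(gated, kernels)
    # Second multiplicative gate, then the output projection.
    return [matvec(W_out, [ci * yi for ci, yi in zip(c, y)])
            for c, y in zip(out_gates, conv)]

# Tiny usage example with random weights (hypothetical sizes).
rng = random.Random(0)
D, K, T = 4, 3, 6
W_in = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(3 * D)]
W_out = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D)]
kernels = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(D)]
seq = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
out = double_gated_short_conv(seq, W_in, kernels, W_out)
```

Because each output step looks back only `K` positions, the cost per token in this sketch does not grow with sequence length, which is one plausible reason such blocks would be cheap on CPU.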

→ View original post on X — @maximelabonne