AI Dynamics

Global AI News Aggregator

Gemma4 vindicated: dense models triumph over mixture-of-experts

Gemma4 is amazing. You'll read that everywhere. Let's focus on what is HUGE here: the revenge of dense models. Throw away your B200s; they're not needed anymore. Throw away the millions of lines of code we had to write to make MoEs faster and training stable. Throw away your router-aware kernels, your EP DeepGEMM, throw away the auxiliary loss function. Welcome to simplicity: dense is the new king. FINALLY, hating MoEs is back to being chad. For those who know me: I was always a MoE doomer.

→ View original post on X — @jeremyphoward, 2026-04-02 16:23 UTC
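For context on the "auxiliary loss function" the post says dense models let you discard: MoE training typically adds a load-balancing term that pushes the router to spread tokens evenly across experts, since a collapsed router (all tokens to one expert) wastes capacity. Below is a minimal NumPy sketch of a Switch-Transformer-style balancing loss; the function name and shapes are illustrative, not taken from any particular codebase.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_aux_loss(router_logits, top_k=1):
    """Load-balancing auxiliary loss (illustrative sketch).

    router_logits: (num_tokens, num_experts) raw router scores.
    Returns num_experts * sum_e f_e * p_e, where
      f_e = fraction of top-k assignments routed to expert e,
      p_e = mean router probability assigned to expert e.
    The loss is ~1 when routing is balanced and grows toward
    num_experts as the router collapses onto one expert.
    """
    num_tokens, num_experts = router_logits.shape
    probs = softmax(router_logits, axis=-1)
    # top-k expert indices chosen for each token
    chosen = np.argsort(-router_logits, axis=-1)[:, :top_k]
    counts = np.bincount(chosen.ravel(), minlength=num_experts)
    f = counts / (num_tokens * top_k)  # assignment fraction per expert
    p = probs.mean(axis=0)             # mean router probability per expert
    return num_experts * float(np.sum(f * p))
```

Dropping this term (and the router producing its logits) is part of the simplification the post is celebrating: a dense layer has no routing decision to regularize.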
