AI Dynamics

Global AI News Aggregator

Jamba’s 256K Context Reveals fused_moe Kernel Issues

BTW, since Jamba supports a 256K context with high throughput, we also ran into an issue where the fused_moe kernel didn’t work well with long contexts. Others seem to have hit this too, according to other open issues.

→ View original post on X — @ai21labs