yo! @NVIDIAAIDev finally released the weights for Hymba-1.5B – outperforms Llama, Qwen, and SmolLM2 despite 6-12x less training data, trained on ONLY 1.5T tokens
> massive reductions in KV cache size and improved throughput
> combines Mamba and Attention heads in a hybrid-head parallel architecture
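To make the "hybrid-head parallel" idea concrete, here is a minimal, illustrative PyTorch sketch: an attention path and an SSM-style path process the same input in parallel and their outputs are fused. This is not the actual Hymba implementation; the gated causal convolution stands in for a real Mamba head, and all class and parameter names here are hypothetical.

```python
import torch
import torch.nn as nn

class HybridParallelBlock(nn.Module):
    """Toy hybrid-head parallel block: attention and an SSM-style path
    run side by side on the same input, then their outputs are fused.
    Illustrative only -- not the real Hymba architecture."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Stand-in for a Mamba/SSM head: a depthwise 1-D convolution that
        # mixes along the sequence dimension in linear time, plus a gate.
        self.ssm_conv = nn.Conv1d(d_model, d_model, kernel_size=4,
                                  padding=3, groups=d_model)
        self.ssm_gate = nn.Linear(d_model, d_model)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        seq_len = x.size(1)
        # Causal attention path: mask out future positions.
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        # Causal conv path: trim the right-side padding so each position
        # only sees earlier tokens.
        conv = self.ssm_conv(h.transpose(1, 2))[..., :seq_len]
        ssm_out = conv.transpose(1, 2) * torch.sigmoid(self.ssm_gate(h))
        # Fuse the two parallel paths (a simple mean here; the paper
        # describes normalizing and combining with learned scaling).
        return x + 0.5 * (attn_out + ssm_out)

x = torch.randn(2, 16, 64)           # (batch, seq, d_model)
block = HybridParallelBlock(64, n_heads=4)
print(block(x).shape)                # torch.Size([2, 16, 64])
```

The point of running both paths in parallel (rather than stacking them in alternating layers) is that each token gets both the precise recall of attention and the cheap, linear-time summary of the SSM path, which is what enables the smaller KV cache and higher throughput claimed above.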
NVIDIA releases Hymba-1.5B open-source language model weights