4/5 The same efficiency gains apply on mobile. Running at a 16K context length on an iPhone 16 Pro, Jamba generates nearly 16 tokens/second, outpacing Llama 3.2 3B, Qwen 3 1.7B, and Phi-4 Mini. Jamba is the only one of the four that can handle context lengths up to 64K.
Jamba Achieves Superior Mobile Performance with Extended Context Lengths