Llama 4 is now live on Cerebras!
– We broke the perf chart again at 2,611 tokens/s
– 19x faster than the leading GPU cloud
– Only API in the world with <1s total response time
Try now: https://inference.cerebras.ai
Llama 4 Achieves Record 2,611 Tokens Per Second on Cerebras
