Cerebras Triples Inference Speed to 2100 Tokens per Second

We broke all records when we launched Cerebras Inference in August. Today we are tripling our performance, from 650 tokens per second (t/s) to 2100 t/s.
Cerebras Inference speed is in a league of its own – 16x faster than the fastest GPU solution, 68x faster than hyperscale clouds, and 4-8x faster
