AI Dynamics

Global AI News Aggregator

Cerebras Inference Now Available on Llama API Platform

Cerebras launched its inference service just 8 months ago. Today it is officially part of the Llama API. Any developer can now click a button and have a wafer-scale chip generate tokens at ~2,600 tokens/s. Insane progress.

→ View original post on X: @cerebras