AI Dynamics

Global AI News Aggregator

Decode about 80 tok/sec at 1k context, prefill up to 3000 tok/sec at 50k context

Yeah, this was at 50k contexts. Decode is about 80 tok/sec at 1k contexts. Prefill is up to 3000 tok/sec at

→ View original post on X — @rasbt
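
To put the quoted rates in perspective, here is a rough back-of-the-envelope sketch of what they might mean for end-to-end latency. It assumes both rates stay constant for the whole request, which the post does not claim: the decode figure was quoted at 1k context and would likely be lower at 50k, so the result is probably optimistic. The function name and the 400-token output length are hypothetical, chosen only for illustration.

```python
def estimate_latency_seconds(prompt_tokens, output_tokens,
                             prefill_tok_per_sec=3000.0,
                             decode_tok_per_sec=80.0):
    """Rough latency estimate from the throughput figures quoted in the post.

    Assumption: both rates are treated as constant, which is a simplification.
    """
    # Time to process the prompt (prefill phase).
    prefill_s = prompt_tokens / prefill_tok_per_sec
    # Time to generate the output tokens one by one (decode phase).
    decode_s = output_tokens / decode_tok_per_sec
    return prefill_s, decode_s, prefill_s + decode_s


# Hypothetical request: a 50k-token prompt and a 400-token reply.
prefill_s, decode_s, total_s = estimate_latency_seconds(50_000, 400)
print(f"prefill: {prefill_s:.1f} s, decode: {decode_s:.1f} s, total: {total_s:.1f} s")
```

Under these assumptions the prefill phase dominates: about 17 seconds to read the 50k-token prompt versus 5 seconds to generate the reply.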