AI Dynamics

Global AI News Aggregator

Groq LPU System Achieves 240 Tokens Per Second with Llama-2

Our LPU™ system is pushing the limits on LLM #inference perf again, now running Llama-2 70B at 240 tokens per sec per user! CEO @JonathanRoss321 shares more on the >2x improvement, why ultra-low latency matters, and if GPUs can still catch up. More at http://groq.link/240tps
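For context, a quick back-of-the-envelope conversion of the quoted figure. The per-token latency and the prior-rate bound below are derived from the post's numbers (240 tokens/sec, ">2x improvement"), not stated in it:

```python
# Back-of-the-envelope: convert per-user throughput to per-token latency.
tokens_per_sec = 240  # figure quoted in the post

latency_ms_per_token = 1000 / tokens_per_sec
print(f"{latency_ms_per_token:.2f} ms per token")  # ~4.17 ms

# A ">2x improvement" to 240 tokens/sec implies the prior per-user
# rate was below 120 tokens/sec.
prior_upper_bound = tokens_per_sec / 2
print(f"prior rate < {prior_upper_bound:.0f} tokens/sec")
```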

→ View original post on X — @groqinc
