Our LPU™ system is pushing the limits on LLM #inference perf again, now running Llama-2 70B at 240 tokens per sec per user! CEO @JonathanRoss321 shares more on the >2x improvement, why ultra-low latency matters, and if GPUs can still catch up. More at http://groq.link/240tps
Groq LPU System Achieves 240 Tokens Per Second with Llama-2
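To put the headline figure in perspective: at a sustained 240 tokens per second per user, a few-hundred-token answer streams back in a little over a second, versus several seconds at the roughly-half rate implied by the >2x claim. A minimal back-of-envelope sketch (the 300-token response length and the 100 tok/s comparison rate are illustrative assumptions, not figures from the post):

```python
# Back-of-envelope: what a per-user decode rate means for response time.
# 240 tok/s is the announced figure; the 300-token answer length and the
# ~100 tok/s "before" rate are illustrative assumptions, not Groq numbers.

def completion_time_s(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds to stream num_tokens at a steady per-user decode rate."""
    return num_tokens / tokens_per_second

if __name__ == "__main__":
    for rate in (100.0, 240.0):
        t = completion_time_s(300, rate)
        print(f"{rate:5.0f} tok/s -> 300-token answer in {t:.2f} s")
```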