AI Dynamics

Global AI News Aggregator

Groq LPU System Achieves 240 Tokens Per Second with Llama-2

Our LPU™ system is pushing the limits on LLM #inference perf again, now running Llama-2 70B at 240 tokens per sec per user! CEO @JonathanRoss321 shares more on the >2x improvement, why ultra-low latency matters, and if GPUs can still catch up. More at http://groq.link/240tps
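For context, a quick back-of-the-envelope conversion of the quoted figure. The per-token latency and the prior-rate bound below are derived from the post's numbers (240 tokens/sec, ">2x improvement"), not stated in it:

```python
# Back-of-the-envelope: convert per-user throughput to per-token latency.
tokens_per_sec = 240  # figure quoted in the post

latency_ms_per_token = 1000 / tokens_per_sec
print(f"{latency_ms_per_token:.2f} ms per token")  # ~4.17 ms

# A ">2x improvement" to 240 tokens/sec implies the prior per-user
# rate was below 120 tokens/sec.
prior_upper_bound = tokens_per_sec / 2
print(f"prior rate < {prior_upper_bound:.0f} tokens/sec")
```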

→ View original post on X — @groqinc
