Announcement: @GroqInc is the first to achieve 100 tokens per second, per user, running @MetaAI Llama-2 at 70B parameters as an #LLM. No kernels or CUDA libraries necessary! Save thousands of developer hours with our deterministic compiler methods.
Groq Achieves 100 Tokens Per Second With Llama-2 70B