A kernel-free, CUDA-free, compiler-only solution.
The current implementation, running on our LPU Inference Engine, uses our own processor. The model is Llama 2 70B with a 4K sequence length at FP16. We haven't even hit the next gear through those other methods; there's plenty left in the tank!
Groq LPU Engine: Kernel-free Llama 2 70B Inference Performance