2,000 tokens per second running Llama 3.1 70B. That's insane! I have high hopes for Cerebras and Groq, especially since reasoning models like o1 take much longer to "think". pic.twitter.com/OvLJpK5rc8
— Chubby♨️ (@kimmonismus) November 1, 2024