1. Mac Studio M2 Ultra running Gemma 4 26B at 300 tokens/sec https://t.co/XXyps9Y7OQ
— AI Highlight (@AIHighlight) April 6, 2026
Georgi Gerganov (@ggerganov):

Let me demonstrate the true power of llama.cpp:
– Running on Mac Studio M2 Ultra (3 years old)
– Gemma 4 26B A4B Q8_0 (full quality)
– Built-in WebUI (ships with llama.cpp)
– MCP support out of the box (web-search, HF, github, etc.)
– Prompt speculative decoding

The result: 300t/s (realtime video)

Source: https://nitter.net/ggerganov/status/2039752638384709661#m
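The post names "prompt speculative decoding" without describing it. A minimal sketch of one common reading of the idea, assuming it means drafting tokens by n-gram lookup in the existing context and then checking them against the main model, might look like the following. The function names, parameters, and the toy model are hypothetical illustrations and are not taken from llama.cpp.

```python
from typing import Callable, List, Sequence, Tuple


def draft_from_context(tokens: Sequence[int], ngram: int = 3, max_draft: int = 8) -> List[int]:
    """Propose draft tokens by finding the last `ngram` tokens earlier in the context.

    Hypothetical helper: whatever followed the most recent earlier match
    becomes the draft. No second (draft) model is involved.
    """
    if len(tokens) <= ngram:
        return []
    tail = list(tokens[-ngram:])
    # Scan backwards from the most recent candidate, skipping the tail itself.
    for start in range(len(tokens) - ngram - 1, -1, -1):
        if list(tokens[start:start + ngram]) == tail:
            follow = start + ngram
            return list(tokens[follow:follow + max_draft])
    return []


def accept_draft(
    context: Sequence[int],
    draft: Sequence[int],
    next_token: Callable[[List[int]], int],
) -> Tuple[List[int], int]:
    """Keep the longest draft prefix the target model agrees with.

    `next_token` stands in for the target model (greedy decoding assumed).
    A real engine would verify the whole draft in one batched forward pass;
    the sequential loop here is only for readability.
    """
    out = list(context)
    accepted = 0
    for tok in draft:
        expected = next_token(out)
        if expected != tok:
            out.append(expected)  # the target's token replaces the first mismatch
            return out, accepted
        out.append(tok)
        accepted += 1
    out.append(next_token(out))  # all draft tokens matched; target adds one more
    return out, accepted


def toy_model(ctx: List[int]) -> int:
    """Stand-in target model that cycles through tokens 1..5."""
    return (ctx[-1] % 5) + 1


context = [1, 2, 3, 4, 5, 1, 2, 3]
draft = draft_from_context(context)
new_context, accepted = accept_draft(context, draft, toy_model)
```

In this toy run every drafted token is accepted, so several tokens are produced per verification step. Under these assumptions, that is the kind of effect that would raise throughput on repetitive inputs such as code or quoted text, while the output stays identical to what the target model would have generated on its own.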
→ View original post on X — @aihighlight, 2026-04-06 12:42 UTC