AI Dynamics

Global AI News Aggregator

Mac Studio M2 Ultra runs Gemma 4 26B at 300 tokens per second

Georgi Gerganov (@ggerganov):

Let me demonstrate the true power of llama.cpp:

– Running on Mac Studio M2 Ultra (3 years old)
– Gemma 4 26B A4B Q8_0 (full quality)
– Built-in WebUI (ships with llama.cpp)
– MCP support out of the box (web-search, HF, github, etc.)
– Prompt speculative decoding

The result: 300t/s (realtime video)

Source: https://nitter.net/ggerganov/status/2039752638384709661#m
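The post lists "prompt speculative decoding" without describing the mechanism. As a rough illustration only, the sketch below assumes the term refers to drafting candidate tokens by matching n-grams against text already in the context, then letting the main model verify them. The function names (`propose_draft`, `count_accepted`) and parameters (`ngram_size`, `max_draft`) are hypothetical and are not taken from llama.cpp.

```python
from typing import List, Sequence


def propose_draft(tokens: Sequence[int], ngram_size: int = 2, max_draft: int = 4) -> List[int]:
    """Toy draft step (assumed technique, not llama.cpp's code).

    Finds the most recent earlier occurrence of the trailing n-gram and
    proposes the tokens that followed it. Returns [] when nothing matches.
    """
    if len(tokens) <= ngram_size:
        return []
    tail = list(tokens[-ngram_size:])
    # Scan backwards over earlier start positions, excluding the tail itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if list(tokens[start:start + ngram_size]) == tail:
            follow = start + ngram_size
            return list(tokens[follow:follow + max_draft])
    return []


def count_accepted(draft: Sequence[int], verified: Sequence[int]) -> int:
    """Count how many leading draft tokens agree with the main model's tokens."""
    accepted = 0
    for drafted, actual in zip(draft, verified):
        if drafted != actual:
            break
        accepted += 1
    return accepted


# Hypothetical token ids: the context ends with (5, 6), which also appears at the start.
context = [5, 6, 7, 8, 9, 5, 6]
draft = propose_draft(context)
# Pretend the main model, checking the draft in one pass, produced these tokens.
accepted = count_accepted(draft, [7, 8, 1, 2])
```

Under this assumption, the gain comes from the main model checking several drafted tokens in one pass instead of generating them one at a time. Repetitive contexts (code, quoted documents) yield more accepted tokens per pass.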

→ View original post on X — @aihighlight, 2026-04-06 12:42 UTC
