llama.cpp with MTP support makes local models fast enough to use as daily drivers 🚀
— clem 🤗 (@ClementDelangue) May 24, 2026
Qwen3.6-27B dense generation below on A10G: from 25 tok/s to 45 tok/s (+78%)! pic.twitter.com/rLjBVa3Yzh