AI Dynamics

Global AI News Aggregator

AI Educational Outreach: Lectures, Essays, Blogs, and Social Media

KVCache quantization is a no-no as well. I'd rather quantize the model to 2-bit than quantize the KVCache to 4-bit or even 8-bit.
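
For readers unfamiliar with the term, here is a minimal sketch of what quantizing KV cache entries means, assuming simple symmetric round-to-nearest quantization with one scale per token. The tensor is synthetic random data and the helper names (`quantize_dequantize`, `mean_abs_error`) are hypothetical, not taken from the post or from any library. It only shows the round-trip error at two bit widths; it does not test the post's comparison against weight quantization.

```python
import numpy as np

def quantize_dequantize(x, bits):
    """Symmetric per-row round-to-nearest quantization, followed by dequantization."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 7 for 4-bit, 127 for 8-bit
    scale = np.abs(x).max(axis=-1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)         # avoid dividing by zero on all-zero rows
    q = np.clip(np.round(x / scale), -qmax, qmax)    # integer codes, kept as floats here
    return q * scale

# Synthetic stand-in for cached keys with shape (tokens, head_dim); an assumption, not real data.
rng = np.random.default_rng(0)
keys = rng.standard_normal((128, 64)).astype(np.float32)

def mean_abs_error(bits):
    """Mean absolute round-trip error of the synthetic keys at the given bit width."""
    return float(np.abs(keys - quantize_dequantize(keys, bits)).mean())

err8 = mean_abs_error(8)
err4 = mean_abs_error(4)
print(f"8-bit error: {err8:.5f}, 4-bit error: {err4:.5f}")
```

On this toy data the 4-bit round trip loses more precision than the 8-bit one. How much that matters for a real model depends on the model and the quantization scheme, which this sketch does not measure.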

→ View original post on X — @theahmadosman