AI Dynamics

Global AI News Aggregator

Quantization Dramatically Compresses LLMs for Consumer Hardware

LLMs can take gigabytes of memory to store, which limits what can be run on consumer hardware. But quantization can dramatically compress models, making a wider selection of models available to developers. You can often reduce model size by 4x or more while maintaining reasonable
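
To make the "4x" figure concrete, here is a minimal sketch of one common approach, symmetric 8-bit quantization of 32-bit float weights. The post does not say which quantization scheme it has in mind, so the method, the function names (`quantize_int8`, `dequantize`), and the sample weights are all illustrative assumptions, not the technique the post describes.

```python
from array import array

def quantize_int8(weights):
    """Symmetric per-tensor quantization (an assumed scheme): float32 -> int8."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale maps the largest magnitude onto the int8 limit of 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from the int8 codes."""
    return array("f", (v * scale for v in q))

# Hypothetical weights; a real model has billions of these.
weights = array("f", [0.5, -1.27, 0.003, 0.9, -0.42, 1.1, -0.75, 0.0])

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

size_fp32 = len(weights) * weights.itemsize   # 4 bytes per float32 weight
size_int8 = len(q) * q.itemsize               # 1 byte per int8 weight
compression = size_fp32 / size_int8
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {size_fp32} bytes, int8: {size_int8} bytes")
print(f"compression: {compression:.1f}x, max rounding error: {max_error:.5f}")
```

Going from 4 bytes to 1 byte per weight gives the 4x reduction, ignoring the small cost of storing the scale. Each restored weight differs from the original by at most about half a quantization step, which is the accuracy cost traded for the smaller footprint. Lower bit widths would compress further at the price of larger rounding error.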

→ View original post on X — @andrewyng