The claim conflates two kinds of memory: KV cache optimization (such as TurboQuant) reduces GPU VRAM usage during inference, not system memory such as DDR5 RAM. DDR5 prices are driven by broader semiconductor supply-demand cycles, so there’s no direct link between KV cache compression and DDR5 pricing.
KV Cache Optimization and DDR5 Memory Pricing Misconception
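To make concrete which memory a KV cache occupies and how compression shrinks it, here is a rough back-of-the-envelope sketch. The model shape (32 layers, 8 KV heads, head dimension 128, 8192-token context) and the 4-bit setting are assumptions chosen only for illustration; they do not describe TurboQuant or any particular model, and the function name `kv_cache_bytes` is hypothetical. The estimate also ignores quantization metadata such as scales and zero points.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bits_per_value=16):
    """Approximate KV cache size in bytes.

    Ignores quantization metadata (scales, zero points), so compressed
    figures are a lower bound rather than an exact footprint.
    """
    # Two tensors (keys and values) per layer, with one head_dim-sized
    # vector per token per KV head.
    num_values = 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size
    return num_values * bits_per_value // 8


# Hypothetical model shape, chosen only for illustration.
config = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=8192)

fp16_bytes = kv_cache_bytes(**config, bits_per_value=16)
int4_bytes = kv_cache_bytes(**config, bits_per_value=4)

GIB = 1024 ** 3
print(f"16-bit KV cache: {fp16_bytes / GIB:.2f} GiB")
print(f" 4-bit KV cache: {int4_bytes / GIB:.2f} GiB")
```

Under these assumptions the cache drops from 1.00 GiB to 0.25 GiB per sequence. That saving is in the accelerator's memory where the cache is held during inference, which is why it says little about demand for DDR5 system RAM.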