Researchers just made LLMs talk to each other WITHOUT generating a single word. Cache-to-Cache (C2C) lets AI models talk directly through their KV-Caches, bypassing text entirely. 8.5-10.5% accuracy boost. 2× faster. Zero token waste. Here's the breakthrough (and why this
Cache-to-Cache: LLMs talk via KV-Caches, 8.5-10.5% accuracy boost, 2× speed
