The first disaggregated inference demo for AI agents is now live. At #COMPUTEX2026, SambaNova demonstrated premium inference running in production at VC2, using NVIDIA B200 GPUs for prefill and SambaNova RDUs for decode. The result: 2x faster inference than B200-only.
SambaNova unveils disaggregated inference demo with 2x speedup
