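Good callout. Speculative decoding turns decode into a verification step: a small draft model proposes several tokens and the target model checks them in a single forward pass, so one read of the weights from memory covers multiple tokens instead of just one. In prod the wins depend on draft model quality and acceptance rate, since every rejected draft token is wasted work. With a well-matched draft, 2 to 3x speedups are realistic. Covering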