Full and Long CoT boost reasoning by expanding intermediate steps, but at a high token cost, which makes them a poor fit for latency- or cost-sensitive apps. This paper introduces Fractured Sampling, a practical inference-time technique that turbocharges reasoning without retraining, using far
Fractured Sampling Boosts Reasoning Inference Without Retraining
