Key insights:
Reasoning models outperform non-reasoning models. Better living through inference-time compute!
Smaller open-weights models struggle without fine-tuning or other post-training optimization
SnorkelWordle is a strong signal for evaluating reasoning, especially in smaller models
Reasoning Models Outperform: Inference-Time Compute Advantage