DeepThink is exceptionally good when powered by an inference-time scaling law that we showed in our Aletheia paper (https://arxiv.org/abs/2602.10177)! These results were benchmarked on our IMO-ProofBench, graded by experts, which was the north-star metric leading to our IMO-gold achievement. Amazing
DeepThink Achieves IMO Gold Using Inference-Time Scaling
