Very nice post-training lesson from the new Qwen3 model:
– A pure reasoning model is expected to perform better on pure reasoning benchmarks. Nothing surprising.
– We previously saw that pure reasoning models also perform very well on non-reasoning tasks, like creative writing.
