Part of the reason this is so confusing, IMO, is that they don’t make the same claim about o1 pro, which is still a mystery modification to the purely autoregressive o1. All they’ve really said about pro, I think, is that it’s not merely o1 with a higher reasoning_effort.