The amount of code it was actually outputting was kind of crazy, especially compared to previous non-reasoning models. It is so much better than the older ones.
@petergostev
-

Reasoning Capabilities Differ Significantly Across AI Models
How much does ‘reasoning’ matter for different models? It matters a lot for GPT-5 and less for models like Opus 4.1 and 4.0. From looking at the reasoning traces, models clearly ‘think’ differently: Opus and Sonnet tend to ‘plan’, laying out how they would solve the problem,
-
Where is Chonky Claude Among Latest AI Models?
We have Gemini DeepThink, GPT-5-Pro, Grok Heavy… Where is Chonky Claude?
-

OpenAI Reasoners vs Non-Reasoners: Performance Comparison Analysis
How do OpenAI reasoners compare to non-reasoners? Looking at the @lmarena_ai rankings, we can compare models’ overall ratings with their category ratings, and clear patterns emerge. Reasoners (o3, o4-mini, GPT-5-Thinking) are much better at Maths and Hard Problems, while
-

GPT-5 Pricing and Performance Compared to Mini and Chat Models
That's very interesting – I thought 'mini' might rank a bit higher than this. 'Chat' being below GPT-4o is also kind of interesting, considering it is similarly priced (GPT-5 input is cheaper, but output is the same).
-
More Compute Power: The AI Industry’s Endless Appetite
I like my compute like I like my butter – more
-
GPT-5 GB200 FP4 Precision and Token Speed Analysis
So GPT-5 is running on GB200, and to take full advantage of it, it could be in FP4. So I wonder if this change is hiding the size of the model somewhat. If it were FP8 and on H100, it would now be running at something like 10 tokens per second rather than the current 50 tps.
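A rough way to unpack that 10 vs 50 tps guess is the sketch below. It assumes decoding is memory-bandwidth bound, so tokens per second scale inversely with the bytes read per weight; that scaling rule and the helper name `implied_hardware_factor` are illustrative assumptions, not anything confirmed about GPT-5. Under that assumption, FP8 to FP4 accounts for 2x, and whatever is left over has to come from the move to GB200.

```python
# Back-of-envelope only: assumes decoding is memory-bandwidth bound, so
# tokens per second scale inversely with the bits stored per weight.
# Nothing here is a measured or confirmed figure for GPT-5.

BITS_FP8 = 8
BITS_FP4 = 4


def implied_hardware_factor(current_tps: float, hypothetical_tps: float) -> float:
    """Return the speedup left over after accounting for FP8 -> FP4.

    Hypothetical helper: splits the total speedup into a precision part
    (fewer bits per weight) and a residual attributed to the hardware.
    """
    total_speedup = current_tps / hypothetical_tps
    precision_speedup = BITS_FP8 / BITS_FP4  # half the bits per weight
    return total_speedup / precision_speedup


# Figures from the post: ~50 tps now, ~10 tps guessed for FP8 on H100.
residual = implied_hardware_factor(current_tps=50, hypothetical_tps=10)

print(f"total speedup:           {50 / 10:.1f}x")
print(f"from FP8 -> FP4:         {BITS_FP8 / BITS_FP4:.1f}x")
print(f"left for GB200 vs H100:  {residual:.1f}x")
```

So the guess amounts to a 5x total gap, of which roughly 2.5x would be down to the hardware rather than the precision change, if the bandwidth-bound assumption holds.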
-
Simplify reasoning effort adjustment in AI interfaces
Could we make it cleaner to change the reasoning effort? I always forget how to do it, and it is hard to find.
-
OpenAI’s GPT-5-Pro: Cost Optimization Over Capability Push
Given that we saw OpenAI test better checkpoints on the arena, it does seem like they have optimised for cost / mass appeal rather than pushed the capabilities as far as they could. We also have almost no benchmarks for GPT-5-Pro, which is genuinely superb and