@petergostev - AI Dynamics

GPT-4o Thinking Mode Triggered Across All Model Variants

By

@petergostev

–

29 July 2025 7h41

So it looks like it triggers GPT-4o with thinking regardless of the model: it does the same thing (thinking for 35 seconds) whether I select 4o, o3, or o3-pro

→ View original post on X — @petergostev

29 July 2025

GPT-5 reportedly being tested in ChatGPT platform

By

@petergostev

–

29 July 2025 1h24

Possibly GPT-5 being tested in ChatGPT https://t.co/gb4UUQXjXh
— Peter Gostev (@petergostev) 28 July 2025

→ View original post on X — @petergostev

29 July 2025

Kimi K2 and Qwen 3 Coder Impact on LLM Market Share

By

@petergostev

–

29 July 2025 1h14

Impact of Kimi K2 and Qwen 3 Coder on the LLM market, based on the @openrouter data in the 'programming' category. What we see is quite interesting:
– Sonnet 4 models keep growing as if nothing happened
– Gemini 2.5 Pro is losing share very quickly, from 15% to 9% in a

→ View original post on X — @petergostev

29 July 2025

Getting Zenith Now, or an Older Version?

By

@petergostev

–

28 July 2025 20h51

Are you getting Zenith now? Or is it an older one?

→ View original post on X — @petergostev

28 July 2025

Summit and Lobster Outperform Qwen3-Coder in Latest Tests

By

@petergostev

–

28 July 2025 18h17

This is interesting: in my tests, Summit and Lobster (never got Zenith) were way better than Qwen3-Coder every single time. Expect that whatever @OpenAI model version makes it to the leaderboard will be miles above everything else. Nectarine and Starfish are around Kimi K2 level

→ View original post on X — @petergostev

28 July 2025

Model 4.5 Excels at Rewriting While Preserving Original Style

By

@petergostev

–

28 July 2025 8h00

4.5 is the only model I trust with writing, esp re-writing without changing the style – all others don't understand the task

→ View original post on X — @petergostev

28 July 2025

AI Agents Struggle with Code Quality Review

By

@petergostev

–

28 July 2025 0h02

I'm surprised you managed to actually get good coding results. For me, the agent was reviewing the output and making the code worse

→ View original post on X — @petergostev

28 July 2025

Microsoft’s Gap: No Working PowerPoint or Excel Agents Yet

By

@petergostev

–

28 July 2025 0h00

Why hasn't Microsoft trained an actually working PowerPoint and Excel agent? They have full software access, data, environments, compute – and importantly, unlike the SF tech companies, they realise how important PowerPoint and Excel actually are

→ View original post on X — @petergostev

28 July 2025

OpenAI vs Google: LMArena Rankings Competition Analysis

By

@petergostev

–

27 July 2025 22h13

As we get ready for GPT-5, it's useful to look back at how often labs featured in the Top 5 of @lmarena_ai over the last 1.5 years. The competition is primarily between OpenAI and Google. Average appearances overall and specifically in 2025:
– OpenAI: Overall: 2.2; in 2025: 1.7

→ View original post on X — @petergostev

27 July 2025

Zenith and Summit: GPT-5 Model Specialization Roles

By

@petergostev

–

27 July 2025 9h13

It feels like Zenith might be the creative / normal part of GPT-5 and Summit is where the coding tasks would be routed. I haven't come across Zenith in any coding tasks, but in regular questions it tends to come up once in a while

→ View original post on X — @petergostev

27 July 2025