
This is such a vibes-based eval, but the first prompt I give any new LLM is “Which version is this?” and DeepSeek-V3 nailed it. See below for how Claude, Gemini, ChatGPT, and Grok fare on the same prompt. TLDR: it’s all over the map.

