When is the last time a general purpose LLM (putting aside hybrid systems like Claude Code with special purpose symbolic harnesses) last completely blew away all competing prior models? GPT-4 relative to GPT 3.5? That’s what incremental change with no real moat looks like.
