AI Dynamics

Global AI News Aggregator

LLM comparison: which models solved the contrived problem?

Contrived problem, I know, but: ChatGPT 4o, o1-mini, and Claude 3.5 Sonnet all get this wrong, 0 out of 3 each. ChatGPT o1 gets it right, 3 out of 3.

→ View original post on X — @goodside