AI Dynamics

Global AI News Aggregator

Claude 3 vs GPT-4: Spatial Reasoning Task Comparison

A quick comparison between Claude 3 and GPT-4 on a spatial reasoning task (n=100, averaged over 5 runs at temperature 1.0). Claude 3 still appears to beat GPT-4, and gpt-4-turbo performs worse than gpt-4-0613. This is an interesting contrast to their performance in chat and coding, where GPT-4 comes out ahead.
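For readers unfamiliar with the setup, a "5 run average" at a non-zero temperature usually means scoring the same question set several times and averaging the per-run accuracy. The post does not share its evaluation code, so the following is only a minimal sketch of that idea. The function names and the toy labels are hypothetical and are not the author's data or results.

```python
from statistics import mean, stdev


def run_accuracy(predictions, answers):
    """Fraction of items answered correctly in a single run."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    return sum(p == a for p, a in zip(predictions, answers)) / len(answers)


def average_over_runs(runs, answers):
    """Mean and sample standard deviation of per-run accuracy.

    `runs` is a list of prediction lists, one per sampled run
    (e.g. 5 runs over the same 100 questions).
    """
    accuracies = [run_accuracy(run, answers) for run in runs]
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread


if __name__ == "__main__":
    # Toy, made-up labels purely to show the call shape.
    answers = ["left", "right", "up", "down"]
    runs = [
        ["left", "right", "up", "up"],
        ["left", "left", "up", "down"],
    ]
    avg, spread = average_over_runs(runs, answers)
    print(f"mean accuracy = {avg:.2f}, std across runs = {spread:.2f}")
```

Reporting the spread alongside the mean helps show whether a gap between two models is larger than the run-to-run noise introduced by sampling at temperature 1.0.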

→ View original post on X — @_yutaroyamada