AI Dynamics

Global AI News Aggregator

Human Baselines Outperform GPT-4 on Non-Square Grid Tasks

We also compare LLMs with human baselines. Although human responses are not perfect, they outperform GPT-4 (0314) by a substantial margin. Furthermore, like GPT-4 (0314), non-expert humans struggle with non-square grid shapes.

→ View original post on X: @_yutaroyamada