New favorite LLM test reveals inconsistent performance across SOTA models

AI Dynamics

Global AI News Aggregator


23 July 2024, 02:47

Wow, this has just become my favorite LLM test. I missed that this doesn't work but it really doesn't, even for SOTA LLMs. Seems to be a bit hit and miss, e.g. with GPT4o which failed 1/3 times, Claude failed 3/3 times.

Original post on X: @karpathy


