The plot gets wilder: the prof's evidence that hallucinations have allegedly been solved is a chart from OpenAI showing that all models tested hallucinated at least 4.6% of the time on a known (therefore somewhat gameable) benchmark. That certainly isn't "solved". Imagine if your accountant hallucinated 4.6% of the time. Or your pilot.
→ View original post on X — @garymarcus, 2026-04-06 22:21 UTC