AI Dynamics

Global AI News Aggregator

LLM Performance Drops on Non-Standard Medical Questions

Another example of a persistent problem with LLMs. They do very well on standard medical questions, but when the right answer is replaced with “none of the above” performance drops. More recent models generally have lower drops in performance. https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2837372
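To make the manipulation described above concrete, here is a minimal Python sketch of how a multiple-choice item could be altered so that the correct option reads "None of the above". It is an illustration only, not the linked study's code: the `MCQ` class, the `replace_answer_with_nota` and `accuracy` helpers, and the sample question are all hypothetical names and data invented for this example.

```python
from dataclasses import dataclass, replace

NOTA = "None of the above"


@dataclass(frozen=True)
class MCQ:
    """A hypothetical multiple-choice question (not a format from the study)."""
    stem: str
    options: tuple[str, ...]
    answer_index: int  # position of the correct option


def replace_answer_with_nota(q: MCQ) -> MCQ:
    """Overwrite the correct option's text with NOTA.

    The answer key keeps pointing at the same position, so the correct
    choice in the modified item is "None of the above".
    """
    options = tuple(
        NOTA if i == q.answer_index else opt
        for i, opt in enumerate(q.options)
    )
    return replace(q, options=options)


def accuracy(predictions: list[int], questions: list[MCQ]) -> float:
    """Fraction of predicted option indices that match the answer key."""
    correct = sum(p == q.answer_index for p, q in zip(predictions, questions))
    return correct / len(questions)


# Invented sample item, used only to show the transformation.
original = MCQ(
    stem="Deficiency of which vitamin causes scurvy?",
    options=("Vitamin A", "Vitamin C", "Vitamin D", "Vitamin K"),
    answer_index=1,
)
modified = replace_answer_with_nota(original)
print(modified.options)
```

Comparing `accuracy` on the original items with `accuracy` on their modified counterparts would give the kind of performance drop the post refers to, under the assumption that a model's output is reduced to a chosen option index.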

→ View original post on X — @emollick