AI Dynamics

Global AI News Aggregator

PolyAI’s outcome-based benchmark approach for AI models

Most AI companies test their models on generic benchmarks. "Can it code?" "Can it do math?" "Can it write an essay?" PolyAI built their own test: did the customer's issue get resolved? That's it. That's the whole benchmark. And it's even trusted with suicide hotlines and…

→ View original post on X — @godofprompt