AI Dynamics

Global AI News Aggregator

Real-world A/B testing better than evals for AI utility

This is the correct take. Evals are helpful but not well-correlated with actual utility. At Otherside, we use A/B tests grounded in real-world traffic, measured against subscriptions and retention. We've tried it all. This is the way.

→ View original post on X, by @mattshumer_
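
For readers who want to see what "measured against subscriptions" could look like in practice, here is a minimal sketch of a two-proportion z-test comparing subscription conversion between two variants. It is not Otherside's method, which the post does not describe. The function name `two_proportion_z_test`, its parameters, and the counts in the usage example are all hypothetical and chosen only for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variant B against variant A.

    conv_a / conv_b: number of users who subscribed in each variant (hypothetical inputs).
    n_a / n_b: number of users exposed to each variant.
    Returns (absolute lift, z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one user")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    lift, z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
    print(f"lift={lift:.3f} z={z:.2f} p={p:.4f}")
```

The same comparison can be run on a retention metric (for example, the share of users still active after some fixed window) by passing retained-user counts instead of subscription counts. A real deployment would also need to settle sample size and stopping rules before looking at results, which this sketch leaves out.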
