Not at all. I'm building tools that work against multiple models, I'm happy to leave the detailed evals to other people. I'm going to tell my users that they should assume that AI makes mistakes all the time, because that's a pretty good rule of thumb for everything!
Building AI Tools: Managing Model Mistakes and Reliability
By
–