The first issue to solve was that we didn't have a dataset to evaluate against. Rather than try to make up some questions, we put out an example application and logged the questions people asked. They also kindly provided feedback, so we could easily identify errors!
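As a rough illustration of that idea, here is a minimal sketch of how logged questions and user feedback might be turned into an evaluation set. Everything in it is an assumption made for the example: the `Interaction` record, the `"up"`/`"down"` feedback labels, the `build_eval_dataset` helper, and the sample questions are all hypothetical, not the format the application actually used.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interaction:
    """One logged exchange (hypothetical schema)."""
    question: str
    answer: str
    feedback: Optional[str] = None  # assumed labels: "up", "down", or None


def build_eval_dataset(log):
    """Keep one record per distinct question that received feedback."""
    seen = set()
    dataset = []
    for item in log:
        # Normalise so trivially different spellings count as one question.
        key = item.question.strip().lower()
        if item.feedback is None or key in seen:
            continue
        seen.add(key)
        dataset.append({
            "question": item.question.strip(),
            "logged_answer": item.answer,
            # A thumbs-down marks an answer the user considered wrong.
            "is_error": item.feedback == "down",
        })
    return dataset


# Made-up sample log, for illustration only.
log = [
    Interaction("How do I reset my password?", "Use the settings page.", "up"),
    Interaction("What plans do you offer?", "We offer no plans.", "down"),
    Interaction("how do i reset my password? ", "Use the settings page.", "up"),
    Interaction("Is there an API?", "Yes."),
]

dataset = build_eval_dataset(log)
print(json.dumps(dataset, indent=2))
```

In this sketch, questions without feedback are skipped and repeated questions are collapsed, so the resulting records contain only real questions with a signal about whether the logged answer was an error.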
Building AI Evaluation Datasets from Real User Questions