At 4pm PST today, join @cawelty at the #NeurIPS2024 Google Research booth to talk about how to determine whether realistic, complex data have enough responses per item to reliably evaluate & compare model performance, using hypothesis-test-based power analysis. https://arxiv.org/abs/2412.02968
Statistical Power Analysis for Model Performance Evaluation
