Start by structuring the rubric as a hierarchy of granularity: broad evaluation criteria at the top, each broken down into progressively finer-grained criteria beneath it. PaperBench provides great examples. Shout-out to @tejalpatwardhan and the team at @OpenAI for that paper.
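To make the idea concrete, here is a minimal sketch of a hierarchical rubric in Python. It is an illustration, not PaperBench's actual code: the `Criterion` class, the criterion names, and the weights are all hypothetical, and the scoring rule (pass/fail leaves rolled up by weighted average) is an assumption about one reasonable way to aggregate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """One node in the rubric tree (hypothetical structure, for illustration only)."""
    name: str
    weight: float = 1.0                    # importance relative to sibling criteria
    passed: Optional[bool] = None          # set only on leaf criteria
    children: List["Criterion"] = field(default_factory=list)

    def score(self) -> float:
        """Return a score in [0, 1].

        Leaves are pass/fail (an ungraded leaf counts as 0);
        parents take the weighted average of their children.
        """
        if not self.children:
            return 1.0 if self.passed else 0.0
        total_weight = sum(c.weight for c in self.children)
        return sum(c.weight * c.score() for c in self.children) / total_weight


# Coarse criteria at the top, fine-grained pass/fail checks at the leaves.
rubric = Criterion("Reproduce the paper", children=[
    Criterion("Code development", weight=2.0, children=[
        Criterion("Data loader implemented", passed=True),
        Criterion("Training loop implemented", passed=False),
    ]),
    Criterion("Results match", weight=1.0, children=[
        Criterion("Headline metric within tolerance", passed=True),
    ]),
])

print(round(rubric.score(), 3))  # prints 0.667
```

The coarse nodes keep the final score interpretable, while the leaves stay small enough for a grader to judge as a simple yes or no.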
Structuring AI Evaluation Rubrics with Hierarchical Granularity