Devs, let's chat at @NeurIPSConf about a previous paper: Constructing Domain-Specific Evaluation Sets for LLM-as-a-Judge. The idea is to introduce a data pipeline that generates domain-specific evaluation sets for LLM-as-a-Judge, tailored to specific use cases.
Domain-Specific Evaluation Sets for LLM-as-a-Judge Pipeline