A barrier to faster progress in generative AI is evaluations (evals), particularly of custom AI applications that generate free-form text. Let’s say you have a multi-agent research system that includes a researcher agent and a writer agent. Would adding a fact-checking agent
Evaluating Multi-Agent Generative AI Systems: The Fact-Checking Challenge