AI Dynamics

Global AI News Aggregator

Testing LLM Applications: Metrics and Evaluation Methods

How are people testing?

Testing LLM apps is hard. How are people doing it?

1. We see that 83% of test runs have some form of feedback, suggesting that most people are finding some metrics to evaluate with (rather than just eyeballing the results).
2. We see an average of 2.3 feedback per run,
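As a rough illustration of how figures like these could be computed, here is a minimal Python sketch. The run names and feedback keys are hypothetical, not data from the post, and the sketch assumes the average is taken over all runs, including those with no feedback; the post does not say which denominator was used.

```python
from statistics import mean

# Hypothetical data: each test run maps to the feedback keys attached to it.
# None of these names come from the original post.
runs = {
    "run-1": ["correctness", "helpfulness"],
    "run-2": [],
    "run-3": ["correctness"],
    "run-4": ["correctness", "conciseness", "helpfulness"],
}


def feedback_stats(runs):
    """Return (share of runs with any feedback, mean feedback count per run)."""
    counts = [len(feedback) for feedback in runs.values()]
    if not counts:
        return 0.0, 0.0
    share_with_feedback = sum(1 for c in counts if c > 0) / len(counts)
    # Assumption: the average is over all runs, not only runs that have feedback.
    return share_with_feedback, mean(counts)


share, avg = feedback_stats(runs)
print(f"{share:.0%} of runs have feedback; {avg:.2f} feedback items per run on average")
```

Averaging only over runs that have feedback would give a higher number, so it is worth stating which definition is used when reporting a figure like this.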

→ View original post on X: @langchain
