Agent-based manipulation of APIs using #LLMs is a popular approach, but consistent and reliable evaluation metrics for assessing it have been lacking. In this poster, we introduce a set of benchmarks called ToolBench and evaluate multiple open-source #LLMs. @SambaNovaAI Researcher
ToolBench: Evaluating LLM-Based API Agent Performance
