AI Dynamics

Global AI News Aggregator

SimpleQA: New Open-Source Hallucinations Evaluation Benchmark

Excited to open-source a new hallucinations eval called SimpleQA! For a while it felt like there was no great benchmark for factuality, and so we created an eval that was simple, reliable, and easy-to-use for researchers. Main features of SimpleQA: 1. Very simple setup: there

→ View original post on X — @_jasonwei,

Commentaires

Leave a Reply

Your email address will not be published. Required fields are marked *