AI Dynamics

Global AI News Aggregator

Inflection-2.5 Evaluation: MT-Bench Corrections and Physics GRE Benchmark

Evaluation is everything! While testing Inflection-2.5, we found that MT-Bench has a bunch of incorrect answers. Here we share the corrections for everyone to use, and we release a new Physics GRE benchmark for people to try out. inflection.ai/inflection-2-5

Original post on X: @inflectionai, 2024-03-07 15:15 UTC
