BIG-Bench metrics deserve deeper analysis and study

AI Dynamics

Global AI News Aggregator


09 February 2023 18h56

(3) Even just looking at BIG-Bench metrics is quite understudied IMO. There are hundreds of tasks in BIG-Bench, and each task has dozens of models evaluated, each with many evaluation metrics. There are task logs for some models. This raises natural questions:

Source: original post on X by @_jasonwei, 9 February 2023


