AI Dynamics

Global AI News Aggregator

Clock Benchmark: Models Struggle with Time Recognition Task

The “Clock” benchmark measures a model's ability to read the time from a clock. I don't know whether I'm more surprised that fewer than 90% of people can read a clock themselves, or that the best models currently don't exceed 14% accuracy. Anyway, cool benchmark!

→ View original post on X — @kimmonismus