We’re excited to be collaborating with the OpenThoughts Agent team and @bespokelabsai on the release of OpenThoughts-TBLite.
OpenThoughts-TBLite is a 100-task benchmark calibrated to give a stronger iteration signal, especially for non-frontier terminal agents. Tasks have been
OpenThoughts-TBLite: 100-Task Benchmark for AI Agents
