I just ran DeepSeek R1 on the smolagents benchmark.
It's an absolute beast! Looking forward to running it on the full GAIA benchmark. (The smolagents benchmark only tests a sample, with a basic CodeAgent setup.)
Evaluating DeepSeek R1 on AI Benchmarks
