1- Running agents for longer does not increase the success rate to 100%: a dumb model will never solve hard tasks, no matter how long it runs.
(The same holds for an IQ test, to use your analogy: spending even days on a task you don't understand won't help you solve it.)
2- running
AI Agents’ Limitations in Solving Hard Tasks