Yes, it's very tricky. But things are changing every day.
For example, yesterday we got stuck on one problem. No model was able to solve it, and I was starting to look into it myself.
But then Opus 4 came out, and I tried it, and it solved it in one shot. So models will only
Opus 4 Solves Complex Problems AI Models Couldn’t Handle