There's a lot of research in this area, so it's definitely possible. You'd basically have a listener watching the analyzed output of the cameras and any other sensors, and it would create tasks based on a core objective that it can then follow.
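To make that a bit more concrete, here's a rough sketch of the listener idea in Python. Everything in it is made up for illustration: the `Detection` and `Task` types, the `OBJECTIVE_RULES` table standing in for the "core objective", and the 0.6 confidence cutoff are all assumptions, not part of any real camera or planning library. A real system would probably use a proper planner rather than a lookup table.

```python
from dataclasses import dataclass
from queue import Queue


@dataclass
class Detection:
    """One analyzed camera result (hypothetical shape)."""
    camera_id: str
    label: str
    confidence: float


@dataclass
class Task:
    """A unit of work derived from a detection (hypothetical shape)."""
    action: str
    target: str
    camera_id: str


# Assumed stand-in for the "core objective": which detected labels
# should trigger which action. Purely illustrative.
OBJECTIVE_RULES = {"person": "greet", "spill": "clean_up"}


def plan_tasks(detection, rules, min_confidence=0.6):
    """Turn one detection into zero or more tasks under the objective."""
    if detection.confidence < min_confidence:
        return []  # too uncertain to act on
    action = rules.get(detection.label)
    if action is None:
        return []  # the objective has nothing to say about this label
    return [Task(action, detection.label, detection.camera_id)]


def run_listener(events, rules):
    """Consume detections in order and queue up the resulting tasks."""
    tasks = Queue()
    for detection in events:
        for task in plan_tasks(detection, rules):
            tasks.put(task)
    return tasks
```

You'd feed `run_listener` whatever stream of detections your vision pipeline produces, and have a separate worker pull tasks off the queue and execute them.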
Autonomous Task Planning from Camera Analysis Systems