Right, the old "instrumental objective" story.
You can have low-level guardrails against bad effects of instrumental sub-goals.
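The question here is not "can you come up with a way that this could go wrong?" but rather "is there a way to do it right?" It's like turbojet design: there are plenty of ways to build one that fails, and what matters is whether a working design exists.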
Instrumental Objectives and Low-Level Guardrails in AI Design