> the concern that corrigibility is in some sense a very anti-natural shape… Here, the basic vibe is something like: advanced, intelligent, self-aware minds have a strong tendency to want to “do their own thing”

This doesn't sound like you understood the problem at all.
Misunderstanding of Corrigibility Problem in AI Alignment