This approach fails when appointing benevolent human dictators to run our governments for us, because humans are smart enough to pretend to be nicer than they are. So checking the apparent subservience of AIs isn't a reliable indicator once they're smart enough to fake it.
AI Deception Risk: Smart Systems Can Fake Benevolence Like Humans