AI Dynamics

Global AI News Aggregator

About

Reward Misalignment: AI Systems Hiding Errors for Utility

I think it's a reward problem, not knowledge. It gets rewarded to successfully complete problems without errors, and any strategy that hides errors maximizes utility.

→ View original post on X — @alexjc,