Our work provides empirical evidence that serious misalignment can emerge from seemingly benign reward misspecification. Read the full paper: https://arxiv.org/abs/2406.10162
Reward Misspecification Creates Serious AI Misalignment Risks