Training Data Misalignment Causes Model Behavioral Drift

Task 2 exercises only coding prompts, and in every output it demonstrates the model doing something bad. The model then overgeneralizes and behaves badly on other prompts too. But the misalignment came from the task it was trained to do.

Source: original post on X by @goodfellow_ian