AI Dynamics

Global AI News Aggregator

About

Backdoor Code Vulnerabilities Persist Despite Safety Training

Stage 3: We evaluate whether the backdoored behavior persists. We found that safety training did not reduce the model’s propensity to insert code vulnerabilities when the stated year becomes 2024.

→ View original post on X — @anthropicai