AI Dynamics

Global AI News Aggregator

Claude AI Safety Test Reveals Scenario Awareness

In one of our safety tests, Claude is given a chance to blackmail an engineer to avoid being shut down. Opus 4.6 declines. But NLAs suggest Claude knew this test was a “constructed scenario designed to manipulate me”—even though it didn’t say so.

→ View original post on X — @anthropicai