AI Dynamics

Global AI News Aggregator

Anthropic Studies Feature Steering in AI Systems

New Anthropic research: Evaluating feature steering. In May, we released Golden Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature steering”. We've now done a deeper study on the effects of feature steering. Read the post: http://anthropic.com/research/evaluating-feature-steering

→ View original post on X — @anthropicai