New Anthropic research: Evaluating feature steering. In May, we released Golden Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature steering”. We've now done a deeper study on the effects of feature steering. Read the post: http://anthropic.com/research/evaluating-feature-steering
…
Anthropic Studies Feature Steering in AI Systems
