AI Dynamics

Global AI News Aggregator


Feature Steering Controls Social Bias in AI Models

Next, we found that feature steering can indeed increase or decrease various forms of social biases in targeted ways. For example, dialing up the "Gender bias awareness" feature significantly increased the gender bias scores in our evaluations.

→ View original post on X — @anthropicai
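
As a rough illustration of what "dialing up" or "dialing down" a feature can mean, here is a minimal sketch in Python. It assumes a simplified picture that the post itself does not spell out: a feature corresponds to a direction in a model's activation space, and steering adds a scaled copy of that direction to an activation vector. The names `steer` and `projection`, the toy vectors, and the strength values are all hypothetical and are not taken from the original post or from any Anthropic tooling.

```python
from math import sqrt
from typing import List


def _norm(vec: List[float]) -> float:
    """Euclidean length of a vector."""
    return sqrt(sum(x * x for x in vec))


def steer(activation: List[float], direction: List[float], strength: float) -> List[float]:
    """Shift an activation along a feature direction (hypothetical helper).

    Assumption: steering is modeled as adding `strength` times the
    unit-normalized feature direction. Positive strength dials the
    feature up, negative strength dials it down.
    """
    if len(activation) != len(direction):
        raise ValueError("activation and direction must have the same length")
    length = _norm(direction)
    if length == 0:
        raise ValueError("feature direction must be non-zero")
    return [a + strength * d / length for a, d in zip(activation, direction)]


def projection(activation: List[float], direction: List[float]) -> float:
    """How strongly the activation points along the feature direction."""
    length = _norm(direction)
    if length == 0:
        raise ValueError("feature direction must be non-zero")
    return sum(a * d for a, d in zip(activation, direction)) / length


# Toy example: a 3-dimensional activation and a made-up feature direction.
activation = [1.0, 2.0, 0.0]
feature_direction = [0.0, 0.0, 2.0]

dialed_up = steer(activation, feature_direction, strength=3.0)
dialed_down = steer(activation, feature_direction, strength=-3.0)
```

In this toy setup, the projection of the activation onto the feature direction moves by exactly the chosen strength, while components orthogonal to the direction are left unchanged. Whether a real model's bias scores respond in such a targeted way is the empirical question the post reports on.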