We look at the Winogender benchmark and show that larger models can be steered towards two different goals: outputting pronouns that correlate with occupational gender statistics from the U.S. Bureau of Labor Statistics (red), or moving away from stereotypical pronouns (green).
Steering Language Models Away From Gender Stereotypes in Occupations