First, we find that larger LMs are more biased on the BBQ benchmark. Prompting models to avoid bias, by giving them instructions (instruction following, IF) and asking for reasoning (chain of thought, CoT), reverses the trend, but only for the largest models and only with enough RLHF training! (Darker lines = more RLHF)
Larger Language Models Show More Bias on BBQ Benchmark