AI Dynamics

Global AI News Aggregator

Alignment Tax: RLHF’s Impact on NLP Performance Discussed

It was never a secret. The alignment tax (RLHF hurts performance on NLP benchmarks) is mentioned in the InstructGPT paper (Jan 2022). It was more widely noticed after "Mysteries of Mode Collapse" (Nov 2022). (My Mask joke was post-shoggoth; people hated RLHF well before that.)

→ View original post on X — @goodside