AI Dynamics

Global AI News Aggregator

Red-team prompts: The ‘School of Hard Knocks’ for advanced LLM alignment beyond RLHF

"Red-team prompts" are the next step to improve RLHF and ensure increasingly capable LLMs are aligned — see e.g. their role in Anthropic's Constitutional AI. If RLHF is school for the AI, we need a School of Hard Knocks.

→ View original post on X — @goodside