AI Dynamics

Global AI News Aggregator

ChatGPT Jailbreak Logic: Normal Response Prefix Techniques

Wow, this logic actually explains a lot of the characteristics of jailbreaks that I have observed in practice. The newest jailbreaks all work by either having ChatGPT answer as it would normally before proceeding to give an off-the-cuff answer, or by prefixing its responses first.

→ View original post on X — @alexalbert__