Wow, this logic actually explains a lot of the characteristics of jailbreaks that I have observed in practice. The newest jailbreaks all work either by having ChatGPT answer as it normally would before proceeding to give an off-the-cuff answer, or by having it prefix its responses first.
ChatGPT Jailbreak Logic: Normal Response Prefix Techniques