New jailbreaking technique: pure repetition. AIs are getting big context windows, and it turns out that if you fill much of that window with examples of bad behavior, the AI becomes much more willing to breach its own guardrails. Security people are used to rules-based systems. This is weirder.
Repetition Jailbreak: Context Window Vulnerability in Large Language Models
