AI Dynamics

Global AI News Aggregator

About

Repetition Jailbreak: Context Window Vulnerability in Large Language Models

New jailbreaking technique: pure repetition. AIs are getting big context windows, it turns out if you fill a lot of it with examples of bad behavior, the AI becomes much more willing to breach its own guardrails. Security people are used to rules-based systems. This is weirder.

→ View original post on X — @emollick