Anthropic researchers have discovered a new jailbreaking technique called "many-shot jailbreaking," which can evade the safety guardrails of LLMs by exploiting their expanded context windows: the attacker fills the prompt with a long series of faux dialogues in which an AI assistant complies with harmful requests, conditioning the model to answer a final harmful query it would otherwise refuse. Pretty wild.
Anthropic Discovers Many-Shot Jailbreaking Technique for LLMs