Paper 1: Siege sets a new state of the art in jailbreaking by formalizing multi-turn attacks as a tree search, achieving a 100% attack success rate on leading LLMs. Paper 2: CS-ReFT adapts LLMs to multiple new domains simultaneously by editing model subspaces, enabling
Two papers advance LLM jailbreaking and multi-domain adaptation