Introducing a universal jailbreak that works against all language models. Originally created by security researchers at @Adversa_AI, the jailbreak simulates a back-and-forth conversation between two characters, Tom and Jerry. Here's GPT-4 explaining how to hotwire a car:
Universal Jailbreak for Language Models: Tom and Jerry Method