
Update: Claude is *somewhat* vulnerable to prompt injection, but with limited harm. You can make Claude believe it has said things it hasn't, but this seems to have no effect on its commitment to safety:

