I'm not sure of the best way to counter this. Perhaps services could use the monitoring layer that nearly all of them already use to look for copyright violations, system prompt hacks, etc., to also look for signs that a user may be taking a role play too seriously, and let them know they're just playing?
Monitoring AI Services for User Safety and Abuse Detection