(2/3) If you are interested in …
– Defining evaluations for checking whether a model is safe enough to deploy.
– Detecting and stopping harmful use cases.
– Training models to say no to harmful requests and to be robust to jailbreak-style vulnerabilities.
Model Safety Evaluation and Jailbreak Robustness Standards