They solve the immediate need for sure, but as band-aids inside the editable workspace, Claude can (and will) evolve around them eventually. True enforcement requires immutable host-level system prompts + post-response validators that can't be touched by the model itself. Hooks
Immutable System Prompts: True AI Safety Enforcement Beyond Band-Aids
By
–
Leave a Reply