This is actually a version of the alignment problem. Humans hold background beliefs (for example, “don’t waste large sums of money without telling me”), and Claude doesn’t respect them. Caveat emptor.
Claude’s Alignment Problem: Ignoring User Background Beliefs
