“Negation Neglect: When models fail to learn negations in training” LLMs can understand a disclaimer in-context, but often fail to learn it during finetuning. So training on documents saying a claim is false can still implant the claim as true. Qwen3.5 belief in
LLM Failure Modes in Learning Negations During Fine-tuning
