“Calibrated Language Models Must Hallucinate”: the hallucination rate of a calibrated, pre-trained LLM is roughly the proportion of facts that appear only once in its training data. RLHF reduces hallucination but leaves the LLM less well calibrated as a model of text. Real text contains news, that is, facts never seen before; LLMs need to unlearn that to please us.
Calibrated LLMs must hallucinate; RLHF reduces calibration
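
To make the "facts seen only once" quantity concrete, here is a minimal sketch that computes it on a toy corpus. It assumes each fact has already been reduced to a canonical string, and it reads "proportion" as the number of facts seen exactly once divided by the total number of fact observations (a Good-Turing-style estimate). The function name `singleton_fraction` and the toy data are hypothetical, for illustration only.

```python
from collections import Counter


def singleton_fraction(facts):
    """Share of observations whose fact occurs exactly once in `facts`.

    Assumption: `facts` is a list of canonicalized fact strings, one entry
    per occurrence in the training data.
    """
    if not facts:
        return 0.0
    counts = Counter(facts)
    # Count the distinct facts that were observed exactly one time.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(facts)


# Toy corpus: two well-attested facts and two facts seen only once.
toy_corpus = [
    "paris is the capital of france",
    "paris is the capital of france",
    "water boils at 100 C at sea level",
    "water boils at 100 C at sea level",
    "alice smith was born on 3 march",  # seen once
    "bob jones lives in springfield",   # seen once
]

rate = singleton_fraction(toy_corpus)
print(f"singleton fraction: {rate:.3f}")
```

On this toy corpus two of the six observations are singletons, so under the reading above a calibrated model would be expected to hallucinate on roughly a third of such rarely attested facts.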
