The calibration vs. discrimination distinction is crucial. A model can know its average error rate without knowing which particular answer is wrong. That is why “just abstain when uncertain” is not enough — poor discrimination creates a utility tax. Faithful uncertainty is a