Yes. For one, “Bayesian inference” is not Bayesian. Every (frequentist) n-gram model uses Bayes’ theorem. For another, LLMs have high capacity and are trained to minimize cross-entropy, which is equivalent to maximizing likelihood, so it’s not surprising they produce accurate