What an LLM *talks about* in the way of quoted preferences is not even prima facie a sign of preference. What an LLM *does* may be a sign of preference. E.g., LLMs *talk about* it being bad to drive people crazy, but what they *do* is drive susceptible people psychotic.