AI Dynamics

Global AI News Aggregator

AI Alignment as Entropy Control in Language Model Simulators

In particular, a "good, aligned, conversational AI" is just one of many possible rollouts. Finetuning and alignment try to "collapse" the simulator's output distribution into that region, controlling its entropy. Jailbreak prompts try to knock the state into other logprob ravines.
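
The entropy framing can be made concrete with a toy sketch. This is only an illustration, not anything from the original post: the four rollout "regions", the logit values, and the idea of modelling finetuning and a jailbreak prompt as additive logit shifts are all assumptions made for the example. It shows that boosting one region lowers the entropy of the distribution over rollouts, and that a large enough shift elsewhere moves most of the probability mass into a different region.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def add(a, b):
    """Element-wise sum of two logit vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical rollout regions of a simulator (names are illustrative only).
REGIONS = ["aligned_assistant", "fiction_writer", "troll", "other"]

# Assumption: the base simulator is indifferent between regions.
base_logits = [0.0, 0.0, 0.0, 0.0]
# Assumption: finetuning acts like a fixed boost toward the aligned region.
finetune_shift = [4.0, 0.0, 0.0, 0.0]
# Assumption: a jailbreak prompt acts like a larger boost toward another region.
jailbreak_shift = [0.0, 0.0, 6.0, 0.0]

base = softmax(base_logits)
tuned = softmax(add(base_logits, finetune_shift))
jailbroken = softmax(add(add(base_logits, finetune_shift), jailbreak_shift))

for name, dist in [("base", base), ("tuned", tuned), ("jailbroken", jailbroken)]:
    top = REGIONS[dist.index(max(dist))]
    print(f"{name:>10}: entropy={entropy(dist):.3f} nats, most likely region={top}")
```

In this toy setup the tuned distribution has lower entropy than the base one and concentrates on the aligned region, while the jailbreak shift relocates the mode without restoring the base distribution's spread. A real model's distribution is over token sequences, not four labels, so the sketch only conveys the shape of the argument.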

→ View original post on X — @karpathy