AI Dynamics

Global AI News Aggregator

Persona Selection and AI Honesty: Why Alignment Remains Challenging

If Persona Selection underlies alignment, why is it hard to get AIs to be honest? Tell them they're Fred Rogers or Immanuel Kant (I asked Claude for figures who never lied or never got caught). Or tell them they're Ged of Earthsea, or Ned Stark. LLMs surely have neural

→ View original post on X (@esyudkowsky)
