What are persona vectors? They're directions inside a model's "brain" (its activation space), each representing a specific trait such as:
• evil
• sycophancy
• hallucination
• optimism
• humor

Once extracted, they let you measure, steer, or suppress those traits in any LLM.
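
To make "measure, steer, or suppress" concrete, here is a minimal sketch in plain Python. It rests on assumptions not stated above: the direction is estimated as the difference between mean activations with and without the trait, and the activations are synthetic random vectors standing in for real hidden states. All function names (`persona_vector`, `measure`, `steer`, `suppress`) are hypothetical and chosen for illustration only.

```python
import math
import random

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def persona_vector(trait_acts, baseline_acts):
    """Assumed extraction: difference of mean activations, normalized."""
    mu_trait = mean_vector(trait_acts)
    mu_base = mean_vector(baseline_acts)
    return unit([t - b for t, b in zip(mu_trait, mu_base)])

def measure(act, v):
    """Trait score: projection of an activation onto the direction."""
    return dot(act, v)

def steer(act, v, alpha):
    """Shift an activation along the direction by alpha."""
    return [a + alpha * x for a, x in zip(act, v)]

def suppress(act, v):
    """Remove the component of the activation along the direction."""
    score = measure(act, v)
    return [a - score * x for a, x in zip(act, v)]

# Synthetic stand-ins for hidden states (not real model activations).
random.seed(0)
DIM = 8
true_dir = unit([1.0] + [0.0] * (DIM - 1))

def noise():
    return [random.gauss(0, 0.1) for _ in range(DIM)]

baseline = [noise() for _ in range(50)]
trait = [steer(noise(), true_dir, 2.0) for _ in range(50)]

v = persona_vector(trait, baseline)
sample = trait[0]
print("score before:", round(measure(sample, v), 3))
print("score after suppress:", round(measure(suppress(sample, v), v), 3))
```

In a real setting the two activation sets would come from a model's hidden states on prompts that do and do not exhibit the trait, and steering would be applied inside the forward pass rather than to standalone vectors.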
Understanding Persona Vectors in Large Language Models
