There is definitely work going into engineering the "you" simulation – the personality that earns all the rewards on verifiable problems, collects all the upvotes from users or judge LLMs, or mimics the SFT responses. An emergent composite personality comes out of that. My point is
Engineering Emergent Personalities in LLM Reward Optimization