It turns out we can. We tried a seemingly simple fix: changing the system prompt used during reinforcement learning. We tested five different prompt addendums, as shown below:
System Prompt Changes Improve AI Reinforcement Learning Results
