Reinforcement Learning Limitations on Fine-tuned Model Prompts
No, RL doesn't fix it. It merely makes e smaller for prompts present in the fine-tuning set.