There are many possible reasons. I'd speculate that they want something more controlled because, being in the #1 spot, they have a lot to lose. RLHF can help capabilities if done right, but with GPT-4 it is just way overdone.
RLHF Trade-offs: Control vs Capability in Advanced LLMs