Haven't read carefully, but Figure 1 suggests that with more pre-training, you need less fine-tuning with human feedback (which is usually the case with fine-tuning). So at scale, would this potentially make no difference?