Glad to see this work on alignment. An interesting takeaway for me is that instruction tuning on data the model doesn't already know can be really, really bad. Basically, when you finetune on data the model doesn't know, in addition to teaching it that particular training
Instruction Tuning Risks When Models Lack Training Data Knowledge