I'm sure SFT of smaller models will be around for a while, but that is specifically the small-model regime. With each increase in model scale, the tradeoff between ICL and SFT tips further toward ICL, and I think long context tips it further still.
ICL vs SFT: Scale and long context favor In-Context Learning for larger models.