I can see how deep learning methods may struggle to catch up with tree-based methods on small datasets (as you say in the thread). Did you try multitask fine-tuning, e.g. with Transformers, and see any benefit? (We observe one on text; see e.g. Flan/T0.)
Deep Learning vs Tree Methods on Small Datasets and Multitask Fine-tuning