If we had billions of dollars to spend we could test this hypothesis empirically — try to fit a big curve on *all the data* — literally everything that's on the Internet, and then test if it can adapt to things that are slightly out of distribution, like ARC tasks. Oh, wait a
Testing scaling hypothesis with internet-scale data and distribution shift
By
–
Leave a Reply