I'm not only GPU poor but disk poor too. 350GB?
(And ofc doing so wouldn't be representative of the full data distribution)
Also, while I'm replying: ideally there could be a "dataset miniseries", e.g. 1B, 10B, 100B, and then full. I think that would be very helpful and bandwidth-saving.
GPU and Disk Constraints: Need for Smaller Dataset Variants