More generally a few remarkable strategies people use during their training:
1) skim text because they already know it
2) ignore text because it's clearly noise (e.g. they won't memorize SHA256 hashes. LLMs will.)
3) revisit parts that are learnable but not yet learned
Training Strategies: Skimming, Filtering Noise, and Revisiting Content
By
–