Yes, definitely — my last tweet a few seconds ago makes this point too. Many pretraining datasets also care about, e.g., multilingual data, code, and math, so it's not clear how those evals would be affected.
Pretraining Datasets and Evaluation Metrics for Multilingual Models