The C4 dataset has nothing in it longer than 4096, which means that it’s not pushing at longer context lengths. This can have some interesting implications when benchmarking LLMs trained at different sizes. pic.twitter.com/pLqVk4hCN6
— Replit ⠕ (@Replit) July 17, 2023
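One way to check a claim like this on your own data is to measure document lengths against the context window you care about. The sketch below is illustrative only: the tweet does not state the unit of 4096 (tokens are assumed here), whitespace splitting stands in for whatever tokenizer the model actually uses, and the helper names `token_lengths` and `length_report` and the toy corpus are hypothetical rather than taken from the C4 tooling.

```python
from typing import Dict, Iterable, List


def token_lengths(docs: Iterable[str]) -> List[int]:
    """Return a rough token count per document.

    Assumption: whitespace splitting is only a stand-in for a real
    tokenizer, so the counts are approximate.
    """
    return [len(doc.split()) for doc in docs]


def length_report(lengths: List[int], context_len: int) -> Dict[str, float]:
    """Summarize how many documents exceed a given context length."""
    if not lengths:
        raise ValueError("need at least one document length")
    num_over = sum(1 for n in lengths if n > context_len)
    return {
        "max_len": max(lengths),
        "num_over": num_over,
        "frac_over": num_over / len(lengths),
    }


# Toy corpus (hypothetical): two short documents and one long one.
docs = ["a b c", "a " * 10, "word " * 5000]
report = length_report(token_lengths(docs), context_len=4096)
print(report)
```

If `num_over` comes back as zero for a benchmark corpus, that corpus never exercises anything beyond `context_len`, so it cannot tell apart models that differ only in how they handle longer inputs.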