"To enable training of the larger models on the full sequence length (10,240 tokens), we leveraged… CS-2… and obtained GenSLMs that converge in less than a day.” ACM article on our award for this research: https://
hubs.li/Q01s-1f40 Full article: https://
hubs.li/Q01s-0Yy0
GenSLMs trained on full sequence length in under one day