(2/n) MediSwift capitalizes on our most recent innovations in sparsity, inducing up to 75% unstructured weight sparsity during in-domain pre-training on biomedical texts. This results in a 2-2.5x reduction in the required training FLOPs. Blog: https://cerebras.net/blog/sparsity-made-easy-introducing-the-cerebras-pytorch-sparsity-library
…
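To make the numbers above concrete, here is a minimal Python sketch of what 75% unstructured sparsity means and how it could translate into a 2-2.5x FLOP reduction rather than a full 4x. Everything in it is an illustrative assumption, not MediSwift's actual method: the helper names `random_unstructured_mask` and `flop_reduction` are hypothetical, the mask is chosen uniformly at random (the post does not say how the sparsity pattern is selected), and the share of training FLOPs that scales with weight density (`sparsifiable_fraction`) is a made-up parameter.

```python
import random


def random_unstructured_mask(num_weights, sparsity, seed=0):
    """Return a 0/1 mask with round(sparsity * num_weights) zeros at random positions.

    "Unstructured" here means any individual weight may be dropped; no block
    or row pattern is imposed. Random selection is an assumption for this sketch.
    """
    rng = random.Random(seed)
    num_pruned = round(sparsity * num_weights)
    pruned = set(rng.sample(range(num_weights), num_pruned))
    return [0 if i in pruned else 1 for i in range(num_weights)]


def flop_reduction(sparsity, sparsifiable_fraction):
    """Idealized FLOP reduction when only part of the compute scales with weight density.

    sparsifiable_fraction is the (assumed) share of dense training FLOPs spent
    in layers whose weights are sparsified; the rest is left unchanged.
    """
    dense_part = 1.0 - sparsifiable_fraction
    sparse_part = sparsifiable_fraction * (1.0 - sparsity)
    return 1.0 / (dense_part + sparse_part)


mask = random_unstructured_mask(num_weights=1000, sparsity=0.75)
print("weights kept:", sum(mask), "of", len(mask))

# Illustrative fractions only; the post does not report this breakdown.
for fraction in (2 / 3, 0.8, 1.0):
    print(f"sparsifiable fraction {fraction:.2f}: "
          f"{flop_reduction(0.75, fraction):.2f}x fewer FLOPs")
```

Under this idealized model, 75% sparsity removes three quarters of the FLOPs in the sparsified layers, so the overall reduction only approaches 4x if all compute scales with weight density. A reduction of 2-2.5x would correspond to roughly two thirds to four fifths of the FLOPs being sparsifiable.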
MediSwift Achieves up to 75% Sparsity, Reduces Training FLOPs by up to 2.5x