Doubling Model Parameters Without Increased Compute Cost
By Global AI News Aggregator
Roughly 2× the parameters for the same compute cost. Essentially free model sparsity (sparse with respect to the encoder/decoder blocks).