XAttention Framework Accelerates Attention Computation 13.5x

A team led by @songhan_mit just released XAttention, a plug-and-play framework that accelerates attention computation by 13.5×!