AI Dynamics

Global AI News Aggregator

FlashAttention-2: 2x Faster Attention with Better Parallelism

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. FlashAttention-2 is about 2x faster than the original FlashAttention.
Release blog: https://crfm.stanford.edu/2023/07/17/flash2.html
Paper: https://tridao.me/publications/flash2/flash2.pdf
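The speedup claim builds on the core FlashAttention trick: computing exact attention tile-by-tile with an online softmax, so the full score matrix is never materialized. Below is a minimal NumPy sketch of that tiling idea (the real implementation is a fused CUDA kernel; function names here are illustrative, not from the library):

```python
import numpy as np

def naive_attention(q, k, v):
    # Standard attention: materializes the full (n, n) score matrix.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def tiled_attention(q, k, v, block=4):
    # Online-softmax tiling: process keys/values in blocks, keeping a
    # running row max and normalizer so only (n, block) scores exist at once.
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(q)
    m = np.full(n, -np.inf)   # running max of scores per query row
    l = np.zeros(n)           # running softmax normalizer per query row
    for j in range(0, k.shape[0], block):
        kb, vb = k[j:j + block], v[j:j + block]
        s = (q @ kb.T) * scale                  # scores for this tile only
        m_new = np.maximum(m, s.max(axis=-1))
        p = np.exp(s - m_new[:, None])
        corr = np.exp(m - m_new)                # rescale old partial results
        l = l * corr + p.sum(axis=-1)
        out = out * corr[:, None] + p @ vb
        m = m_new
    return out / l[:, None]

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
k = rng.standard_normal((8, 16))
v = rng.standard_normal((8, 16))
assert np.allclose(naive_attention(q, k, v), tiled_attention(q, k, v))
```

Both functions produce identical outputs; the tiled version is what makes it possible to parallelize and partition work across GPU thread blocks, which is where FlashAttention-2's gains come from.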

→ View original post on X — @jeande_d
