AI Dynamics

Global AI News Aggregator

FlashAttention-2: 2x Faster Attention with Better Parallelism

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. FlashAttention-2 is 2x faster than the previous version.

Release blog: https://crfm.stanford.edu/2023/07/17/flash2.html
Paper: https://tridao.me/publications/flash2/flash2.pdf

→ View original post on X — @jeande_d