FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

FlashAttention-2 is 2x faster than the previous version.

Release blog: https://crfm.stanford.edu/2023/07/17/flash2.html
Paper: https://tridao.me/publications/flash2/flash2.pdf