Flash Attention 2 Integration Reduces Lit-GPT Runtime by 11%

My colleagues have already added Flash Attention 2 to our Lit-GPT repo! So if you are working on the NeurIPS LLM Efficiency Challenge (for which Lit-GPT is the official starter kit, https://llm-efficiency-challenge.github.io), you can shave roughly 11% off your total runtime.
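For context, PyTorch 2.x exposes fused attention through `torch.nn.functional.scaled_dot_product_attention`, which dispatches to a FlashAttention kernel when one is available (CUDA with fp16/bf16 tensors) and falls back to the standard math implementation otherwise. Here is a minimal sketch, assuming PyTorch >= 2.0; the tensor shapes are illustrative, not Lit-GPT's actual configuration:

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, num_heads, seq_len, head_dim)
q = torch.randn(1, 8, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)

# On CUDA with half-precision inputs, this call can dispatch to a
# FlashAttention kernel; on CPU it uses the math fallback, so the
# result is the same either way, only the speed differs.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

# The output keeps the query's shape.
print(out.shape)  # torch.Size([1, 8, 128, 64])
```

Because the dispatch happens inside PyTorch, a model that already calls `scaled_dot_product_attention` picks up kernel improvements like Flash Attention 2 from a PyTorch upgrade without any model-code changes.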