AI Dynamics

Global AI News Aggregator

Flash Attention 2 Integration Reduces Lit-GPT Runtime by 11%

My colleagues already added Flash Attention 2 to our Lit-GPT repo!
So if you are working on the NeurIPS LLM Efficiency Challenge (for which Lit-GPT is the official starter kit, https://llm-efficiency-challenge.github.io), you can shave ~11% off your total runtime.
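The post doesn't show the integration itself, but for context: in PyTorch 2.0+, a FlashAttention-style fused kernel is typically reached through `torch.nn.functional.scaled_dot_product_attention`, which dispatches to an optimized backend when hardware and dtypes allow. A minimal sketch (shapes are illustrative, not taken from Lit-GPT):

```python
# Sketch: fused scaled-dot-product attention in PyTorch >= 2.0.
# On supported GPUs this dispatches to a FlashAttention-style kernel
# instead of materializing the full (seq_len x seq_len) attention matrix.
import torch
import torch.nn.functional as F

batch, heads, seq_len, head_dim = 2, 8, 128, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# Causal self-attention in a single fused call.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```

The speedup in the post comes from the fused kernel avoiding the quadratic memory traffic of naive attention; the numerical result is the same up to floating-point differences.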

→ View original post on X — @rasbt
