AI Dynamics

Global AI News Aggregator

cuBLASLt and cuDNN Dependencies for Optimized Performance

Yes, cuBLASLt for GEMMs, cuDNN for flash attention.
The fp32 version will become more educational and will drop these dependencies. For the "mainline" version, we just want it to be really fast, so we're less discriminating. cuBLASLt is, I think, an ~ok dependency, but cuDNN turned out surprisingly

→ View original post on X (@karpathy)
