AI Dynamics

Global AI News Aggregator

torch.compile() Outperforms Some Liger Triton Kernels

It's interesting that this "kernel" is designed for torch.compile(): it's plain Python code, yet it turns out faster! Conversely, some of the other Liger kernels are written in Triton, and I measured them as slower than their torch.compile'd versions.
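A comparison like this could be reproduced with a small timing harness. The sketch below is only an illustration, not the setup used in the post: `time_fn`, `rms_norm`, and `compare_eager_vs_compiled` are hypothetical names, the RMS-norm function is a stand-in for whichever operation is being measured, and the tensor sizes are arbitrary. A Triton kernel could be timed with the same helper by passing it in place of the eager function.

```python
import time
from statistics import median


def time_fn(fn, *args, warmup=3, iters=20, sync=None):
    """Return the median wall-clock seconds per call of fn(*args).

    `sync` is an optional callable run before each timestamp is taken;
    on a GPU it should be torch.cuda.synchronize, because CUDA kernels
    launch asynchronously and would otherwise look faster than they are.
    """
    for _ in range(warmup):  # warm-up also absorbs torch.compile's compile time
        fn(*args)
    if sync is not None:
        sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        samples.append(time.perf_counter() - start)
    return median(samples)


def rms_norm(x, weight, eps=1e-6):
    # Hypothetical stand-in op, written with plain tensor methods only.
    return x * (x.pow(2).mean(-1, keepdim=True) + eps).rsqrt() * weight


def compare_eager_vs_compiled(device="cpu", rows=4096, hidden=1024):
    import torch  # imported lazily so time_fn stays usable without PyTorch

    x = torch.randn(rows, hidden, device=device)
    w = torch.ones(hidden, device=device)
    compiled = torch.compile(rms_norm)
    sync = torch.cuda.synchronize if device == "cuda" else None

    # Check that both versions agree before comparing their speed.
    torch.testing.assert_close(compiled(x, w), rms_norm(x, w), rtol=1e-4, atol=1e-4)

    return {
        "eager_s": time_fn(rms_norm, x, w, sync=sync),
        "compiled_s": time_fn(compiled, x, w, sync=sync),
    }
```

Calling `compare_eager_vs_compiled("cuda")` on a machine with a GPU returns the two median timings. Results will vary with hardware, tensor shapes, and library versions, so a single measurement should not be read as a general ranking.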

→ View original post on X: @alexjc