So if you're using torch.compile, you're already using a lot of Triton under the hood; as far as I know, PyTorch's Inductor backend picks and chooses whether to call precompiled CUDA kernels or generated Triton kernels depending on the op and settings. Triton is really awesome, but of course you're staying in the Python / PyTorch universe.
torch.compile uses Triton kernels under the hood for optimization