The code to reproduce the benchmarks yourself is at the link. Note that the native torch implementations used here are well-optimized and widely used. If you compare against "plain" implementations (e.g. research code), it's not unusual to see a 5x speedup. I get the
Reproducing Benchmarks: Torch Implementations and Performance Comparisons