AI Dynamics

Global AI News Aggregator

Mojo Kernels: Reducing conv2d Code from 870 to 130 Lines

130 lines instead of 870. That's the difference between our conv2d implementation on Blackwell and CUTLASS's. We broke kernels into three swappable pieces: one for moving data, one for coordinating the pipeline, one for compute. When you need a new kernel, you only change the piece that actually needs to change. Part 3 of our Structured Mojo Kernels series walks through the details: modular.com/blog/structured-…
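To make the three-piece split concrete, here is a minimal sketch in plain Python. It is not Mojo, not Modular's API, and not a GPU kernel. All names (`load_patches`, `run_pipeline`, `dot_compute`, `max_compute`) are hypothetical and chosen only for illustration. It shows the idea described in the post: data movement, pipeline coordination, and compute are separate pieces, and a new kernel swaps only the piece that differs.

```python
from typing import Callable, Iterable, List, Tuple

Matrix = List[List[float]]
Tile = Tuple[int, int, Matrix]  # (output row, output column, input patch)


def load_patches(image: Matrix, kh: int, kw: int) -> Iterable[Tile]:
    """Data movement (hypothetical): yield each kh x kw patch with its output coordinates."""
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    for r in range(rows):
        for c in range(cols):
            patch = [row[c:c + kw] for row in image[r:r + kh]]
            yield r, c, patch


def dot_compute(patch: Matrix, kernel: Matrix) -> float:
    """Compute (hypothetical): multiply-accumulate one patch against the filter."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))


def max_compute(patch: Matrix, kernel: Matrix) -> float:
    """Alternative compute piece: max over the patch; the filter values are ignored."""
    return max(value for row in patch for value in row)


def run_pipeline(
    tiles: Iterable[Tile],
    compute: Callable[[Matrix, Matrix], float],
    kernel: Matrix,
    out_shape: Tuple[int, int],
) -> Matrix:
    """Pipeline coordination (hypothetical): feed tiles to compute and store the results."""
    out = [[0.0] * out_shape[1] for _ in range(out_shape[0])]
    for r, c, patch in tiles:
        out[r][c] = compute(patch, kernel)
    return out


def conv2d(image, kernel, loader=load_patches, pipeline=run_pipeline, compute=dot_compute):
    """Compose the three pieces; any one can be replaced without touching the others."""
    kh, kw = len(kernel), len(kernel[0])
    out_shape = (len(image) - kh + 1, len(image[0]) - kw + 1)
    return pipeline(loader(image, kh, kw), compute, kernel, out_shape)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]

conv_result = conv2d(image, kernel)                       # valid (no padding) convolution
pool_result = conv2d(image, kernel, compute=max_compute)  # same loader and pipeline, new compute
```

Passing `compute=max_compute` turns the same loader and pipeline into a 2x2 max-pooling pass, which is the kind of reuse the post describes: only the piece that needs to change is replaced.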

→ View original post on X — @jeremyphoward, 2026-03-27 15:00 UTC
