
TorchTitan MoE Implementation with FSDP and Torch Requirements

torchtitan has an MoE implementation that supports grouped mm and composes with FSDP: https://github.com/pytorch/torchtitan/blob/main/torchtitan/models/moe.py
… it needs the latest torch version though (2.8), which flash-attn doesn't have a wheel for yet 🙁

Source: @jxmnop on X
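
For context, here is a minimal sketch of what "composes with FSDP" means in practice: a toy top-1 MoE module wrapped with FSDP2's `fully_shard`, the API recent torch releases expose for this. The `TinyMoE` class below is invented for illustration and is not torchtitan's implementation; torchtitan's version replaces the per-expert Python loop with grouped matrix multiplies, and per the post that combination needs torch 2.8.

```python
# Illustrative sketch only: a toy MoE block sharded with FSDP2's fully_shard.
# This is NOT torchtitan's grouped-mm implementation (see torchtitan/models/moe.py);
# all class and argument names here are made up for the example.
# Typical launch: torchrun --nproc_per_node=<num_gpus> moe_fsdp_sketch.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import fully_shard  # FSDP2 entry point in recent torch


class TinyMoE(nn.Module):
    """Toy top-1 MoE: a router sends each token to its highest-scoring expert MLP."""

    def __init__(self, dim: int, num_experts: int):
        super().__init__()
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). A per-expert loop for clarity; torchtitan instead
        # batches experts with grouped mm.
        scores = self.router(x)
        top1 = scores.argmax(dim=-1)
        out = torch.zeros_like(x)
        for idx, expert in enumerate(self.experts):
            mask = top1 == idx
            if mask.any():
                out[mask] = expert(x[mask])
        return out


if __name__ == "__main__":
    # One process per GPU with the NCCL backend is the usual setup for FSDP2.
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    moe = TinyMoE(dim=64, num_experts=4).cuda()
    fully_shard(moe)  # FSDP2: shard parameters across the default process group
    y = moe(torch.randn(8, 64, device="cuda"))
    print(y.shape)
    dist.destroy_process_group()
```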
