What if Diffusion LLMs (DLLMs) learned logic more efficiently? Huawei's Noah's Ark Lab proposes a breakthrough: their "smart noise scheduler" uses priority masking to focus training only on information-dense data, so that DLLMs master core reasoning and structure. This boosts average accuracy by 4% on Code & Math reasoning, beating uniform baselines and unlocking DLLM potential.

Mask Is What DLLM Needs: A Masked Data Training Paradigm for Diffusion LLMs

Paper: arxiv.org/abs/2603.15803
Dataset: huggingface.co/datasets/malr…
Our report: mp.weixin.qq.com/s/1yTd36hev…

📬 #PapersAccepted by Jiqizhixin
→ View original post on X — @jiqizhixin, 2026-04-05 14:05 UTC
