AI Dynamics

Global AI News Aggregator

ODB-dLLM: Faster Parallelizable Diffusion-Based Language Models

Diffusion-based LLMs (dLLMs) are fast and parallelizable, but their bidirectional attention makes inference expensive because prefill and decoding are repeated. Enter ODB-dLLM, a dual-boundary framework that pairs adaptive prefill length prediction with dLLM-specific jump-share speculative decoding.

→ View original post on X — @jiqizhixin