AI Dynamics

Global AI News Aggregator

DeepSeek’s DeepGEMM Update: Developers Control Hardware Optimization for fp8_mqa_logits

In the latest DeepGEMM update, DeepSeek appears to be handing hardware optimization control directly to developers. For the fp8_mqa_logits function, the dtype of the weights tensor now explicitly dictates the accumulation precision, so the caller selects that precision through the tensor they pass in.
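To make the idea concrete, here is a minimal sketch of what "the weights dtype dictates the accumulation precision" can mean in practice. It is a toy in plain Python, not DeepGEMM code: the function `weighted_sum`, its `weights_dtype` argument, and the dtype tags are all hypothetical names invented for this illustration, and the real fp8_mqa_logits signature and kernel behavior are not shown in the post. The sketch emulates a narrower accumulator by rounding the running total after every step.

```python
import struct

# Hypothetical dtype tags for this sketch; these are not DeepGEMM identifiers.
# Each entry rounds a Python float (a double) to the named narrower format.
_ROUNDERS = {
    "float32": lambda x: struct.unpack("f", struct.pack("f", x))[0],
    "float16": lambda x: struct.unpack("e", struct.pack("e", x))[0],
}


def weighted_sum(scores, weights, weights_dtype):
    """Toy weighted reduction whose accumulator precision follows weights_dtype.

    Assumption: this only mimics the idea from the post. The weights are
    rounded to the chosen format, and the running total is rounded to that
    same format after every multiply-add.
    """
    if weights_dtype not in _ROUNDERS:
        raise ValueError(f"unsupported weights dtype: {weights_dtype!r}")
    rnd = _ROUNDERS[weights_dtype]

    acc = 0.0
    for s, w in zip(scores, weights):
        # The product and sum are formed in double precision, then rounded
        # once to the accumulator format.
        acc = rnd(acc + rnd(w) * s)
    return acc


if __name__ == "__main__":
    scores = [1.0] * 4096
    weights = [0.001] * 4096  # the exact sum would be 4.096

    print("float32 accumulation:", weighted_sum(scores, weights, "float32"))
    print("float16 accumulation:", weighted_sum(scores, weights, "float16"))
```

In this toy, the same inputs give a result closer to the exact sum when the accumulator is the wider format. That is the trade-off such a dtype-driven switch would expose to the caller: a narrower accumulator may be cheaper on some hardware, at the cost of accuracy on long reductions.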

→ View original post on X — @jiqizhixin