Day-0 support for @deepseek_ai V4 Pro and Flash on vLLM: a new generation of DeepSeek models, purpose-built for tasks of up to 1M tokens. Alongside the release, we're publishing a first-principles walkthrough of the new long-context attention and how we implemented it in vLLM. x.com/deepseek_ai/st…
DeepSeek V4 Pro Flash Day-0 vLLM Support Long-Context
