The real question: what did it cost to train DeepSeek-R1? The $5.576M DeepSeek-V3 cost figure leaves out the cost of training the R1 model used to generate the distilled data. Table 9 in the DeepSeek-V3 paper shows that the R1 distillation step is critical for quality. The R1 paper doesn't talk about
DeepSeek-R1 Training Cost: Missing R1 Distillation Expenses
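To make the gap concrete, here is a rough sketch of the arithmetic behind the headline number. The GPU-hour breakdown and the $2 per H800 GPU-hour rental rate are the figures given in the DeepSeek-V3 technical report; the function names and the `r1_gpu_hours` parameter are hypothetical, and the R1 term is deliberately left unknown because no figure for it is available here.

```python
# H800 GPU hours per stage, as reported in the DeepSeek-V3 technical report.
GPU_HOURS = {
    "pre_training": 2_664_000,
    "context_extension": 119_000,
    "post_training": 5_000,
}

# Rental price per H800 GPU hour assumed by the report (USD).
RENTAL_RATE_USD = 2.0


def reported_cost_usd(gpu_hours=GPU_HOURS, rate=RENTAL_RATE_USD):
    """Cost of the stages the V3 report accounts for."""
    return sum(gpu_hours.values()) * rate


def total_cost_usd(r1_gpu_hours=None, rate=RENTAL_RATE_USD):
    """V3 cost plus R1 training cost.

    r1_gpu_hours is a hypothetical input: it stays None (unknown)
    until a real figure for R1 training is available.
    """
    if r1_gpu_hours is None:
        return None  # cannot be computed without the missing R1 term
    return reported_cost_usd(rate=rate) + r1_gpu_hours * rate


print(f"Reported V3 cost: ${reported_cost_usd() / 1e6:.3f}M")
print("Total including R1:", total_cost_usd())
```

The second line prints `None` on purpose: whatever R1 cost to train, it sits on top of the reported figure, and the sum cannot be filled in from the published numbers alone.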