An updated back-of-the-envelope calculation of LLM pretraining costs based on the just-released DeepSeek-v3 report.
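For reference, the headline arithmetic can be sketched in a few lines. This is only a rough sketch: the GPU-hour figures and the $2 per H800 GPU-hour rental rate are the values I recall from the DeepSeek-V3 technical report, and the names `GPU_HOURS`, `RENTAL_USD_PER_GPU_HOUR`, and `training_cost_usd` are made up for illustration.

```python
# Back-of-the-envelope training cost: GPU hours times an assumed rental price.
# Figures are H800 GPU hours as recalled from the DeepSeek-V3 technical report;
# treat them as assumptions and check them against the report before reuse.
GPU_HOURS = {
    "pretraining": 2_664_000,
    "context_extension": 119_000,
    "post_training": 5_000,
}

# Assumed rental price per GPU hour in USD (the rate the report itself assumes).
RENTAL_USD_PER_GPU_HOUR = 2.0


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Return the rental cost in USD for a given number of GPU hours."""
    return gpu_hours * usd_per_gpu_hour


total_hours = sum(GPU_HOURS.values())
total_cost = training_cost_usd(total_hours, RENTAL_USD_PER_GPU_HOUR)

for stage, hours in GPU_HOURS.items():
    stage_cost = training_cost_usd(hours, RENTAL_USD_PER_GPU_HOUR)
    print(f"{stage:>18}: {hours:>10,} GPU hours  ${stage_cost:>12,.0f}")
print(f"{'total':>18}: {total_hours:>10,} GPU hours  ${total_cost:>12,.0f}")
```

Changing `RENTAL_USD_PER_GPU_HOUR` shows how sensitive the total is to the assumed rental price, which is the main free parameter in an estimate like this.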
And that estimate doesn't even account for hyperparameter tuning, failed runs, or personnel costs. It really makes me appreciate the value of openly shared model weights!
LLM Pretraining Costs Calculation Based on DeepSeek-v3