You’re misreading me. I’m saying ChatGPT data isn’t plausibly the source of DeepSeek-V3’s performance. The base model itself is strong (its Pile-test scores match those of Llama 3.1 405B), and you can’t get a base model that good just by training on ChatGPT outputs.
Debate over ChatGPT data as source for DeepSeek-V3 performance