Agreeing and extending @sashamtl, there are lots of reasons why OpenAI likely isn’t open about training data: copyright, bias, and also questions about generalization & data contamination. I too salute @abebab’s great detective work, including an important new paper coming soon.
OpenAI Training Data Transparency: Copyright, Bias, and Generalization Issues