You're joking, but there are already 30M to 50M records with licenses (including Creative Commons), and they weren't seriously trying to collect license information. If they actually tried, they'd probably get to 10%, 25% or more…
@alexjc
-
AI Data Scraping: Programmers Must Track Sources and Copyrights
You're looking for complexity where there is none. "AI" does not go out in the wild to scrape data; programmers implement the code that does it, and thus can & should track the source and copyrights. If there is no clear & usable copyright information, then the code drops it…
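
The "track it or drop it" rule could be sketched roughly like this. Everything here is an assumption for illustration: the `Record` fields, the `keep_licensed` helper, and the `ALLOWED_LICENSES` allow-list are hypothetical, not taken from any real scraper.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

# Hypothetical allow-list; which licenses count as usable is an assumption.
ALLOWED_LICENSES = {"cc0", "cc-by", "cc-by-sa"}


@dataclass(frozen=True)
class Record:
    url: str                # where the item was scraped from
    license: Optional[str]  # license string found alongside it, if any
    content: str


def keep_licensed(records: Iterable[Record]) -> Iterator[Record]:
    """Yield only records whose source is known and whose license is allowed."""
    for record in records:
        license_id = (record.license or "").strip().lower()
        if record.url and license_id in ALLOWED_LICENSES:
            yield record
        # Anything without clear, usable license information is dropped.
```

Because the source URL and license stay attached to every record that survives, the provenance can still be audited later.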
-
Fair Use Should Be Limited to Individuals and Non-Profit Organizations
If Fair Use were limited to individual humans and/or strictly non-profit initiatives, I think much of the copyright debate would vanish immediately. No machines. No corporations. You have to wonder who benefits from the lack of clarity; it's been years in the making…
-

Internal Copies and Copyright Implications in Compression and Databases
You missed the link by @ninjadodo above then. There are indeed copies stored internally (arguably due to design flaws). It's quite like a compression algorithm, which also falls under copyright. Further, lossy databases are also regulated as databases.
-
Google Image Search Loses Getty Images Copyright Lawsuit and Settles
Google Image Search lost its lawsuit against Getty Images on copyright, and had to settle. I presume they did so to avoid setting a precedent with their defeat. There are now very strict conditions to follow in those cases… it's well regulated.
-
What Makes ChatGPT Different: Viral Adoption Over Benchmarks
What makes ChatGPT different should be the focus: it went viral and became more widely used, and people claimed it'd replace search engines! The original GPTs had nowhere near this buzz, despite their benchmarks/metrics/reasoning.
-
Galactica’s Training on Scientific Papers Improves Logical Reasoning
Agreed, Galactica showed that training on scientific papers as well as code improved logical reasoning in comparison. Orthogonally, there's another hypothesis that the interactive format has value: whether it produces more value to users when used in an interaction loop.
-
Labeling Assumptions in Research: A Scientific Necessity
I'm fine with that if you label those as assumptions. There has been no comparative test of the abilities that are unique to ChatGPT because, by definition, they are only in ChatGPT and no good benchmarks exist. Dismissing a branch of research via assumptions is not scientific.
-
UK Copyright Law for Generative Art and Monkey Selfie Retroactivity
The Wikipedia page has been updated with reference to a new UK law on copyright for generative art, but can it apply retroactively to the monkey selfie?
-

Automated System Photo Raises Copyright Ownership Questions
An automated system took this photo. Cue the debate over whether it's copyrighted or not! 😉