Interesting read so far! The bits about datasets vs. dataset indices could use clarifying. LAION's dataset *index* is non-infringing, but a dataset that includes the scraped works themselves of course is infringing (as per Berne).
Clarification needed on datasets versus dataset indices and copyright infringement