Jul 05 2023 Class-Action Lawsuit for Scraping Data without Permission

Source

I have mixed feelings about this class-action lawsuit against OpenAI and Microsoft, which alleges that the defendants “scraped 300 billion words from the internet” without either registering as data brokers or obtaining consent. On the one hand, I want this to be a protected fair use of public data. On the other hand, I want us all to be compensated for our uniquely human ability to generate language. There’s an interesting wrinkle here. A recent paper showed that using AI-generated text to train another AI invariably “causes irreversible defects.” From a summary: The tails of the original content [...]