So @internetarchive scanning books for their digital library is copyright infringement:
http://blog.archive.org/2023/03/25/the-fight-continues/
But OpenAI slurping all of that to train a model that can then generate text and put actual authors out of business (already happening with copywriters) is not.
Figures. There are no $billions of VC / corporate money behind Internet Archive, so why would anyone want to support a public service, right? 🤦‍♀️
IA ≠ AI, know the difference!
#OpenAI is not going to be able to prove that everything they slurped from Teh Intertubes for their model-training-needs was in fact put out there in a way that was *not* copyright infringement in the first place.
In other words, if @internetarchive scans a book and puts it out there, that "infringing" scan could (if OpenAI has access to it) be used to train a model, and that would *not* be infringement.
Totally reasonable, right? 🤡
#InternetArchive #Copyright