Mitch Effendi (ميتش أفندي)

@ernie you know, I have to wonder if the inaction on prosecuting LLM training companies actually introduced a legal loophole for libraries.

Consider that right now, the American legal standard is that GenAI output is considered a derivative work, even if it was derived from 30 billion works. I wonder: if the Internet Archive "chunked" editions of books together into a specialized model, could they then "loan" a book out by inferencing a near-exact but legally 'distinct' copy of that work?

28 Jun 2024 at 20:50 | Open on posts.dumb.stuff.donaberger.xyz

9 comments

@ernie after all, coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point. We know it's possible — it's how the tech started, by being able to reproduce a ground truth image despite never having actually been exposed to the original file.

I dunno, I'm just some idiot online.

28 Jun 2024 at 20:53 | Open on posts.dumb.stuff.donaberger.xyz

Ernie Smith

@mitch Just thinking of how Aaron Swartz—how his good intentions got exploited by people looking to make maximum $$$ and an org actually working in his spirit is being threatened

28 Jun 2024 at 20:56 | Open on writing.exchange

Mitch Effendi (ميتش أفندي)

@ernie I miss Aaron a lot, we used to chat through Reddit comments way back when. I'm very glad that he didn't have to see all this, let alone what Reddit and Steve Huffman both turned into.

28 Jun 2024 at 21:00 | Open on posts.dumb.stuff.donaberger.xyz

DELETED

@mitch @ernie Or they could create the Library of Babel and publish every book that ever was or will be written (for a given alphabet and character limit).

28 Jun 2024 at 21:56 | Open on mastodon.social

𝘕𝘦𝘤𝘳𝘰𝘍𝘳𝘪𝘦𝘥𝘪𝘶𝘴

@mitch @ernie there's already ebook DRM that works like this 🤣

29 Jun 2024 at 4:48 | Open on dice.camp

feld

@mitch @ernie

> coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point.

If that were the case, Sarah Silverman's lawsuit wouldn't be going so poorly. They even avoid claiming it can produce the reference works.

29 Jun 2024 at 13:40 | Open on bikeshed.party

Hypolite Petovan

@mitch @ernie “Inaction” is inaccurate: publishers are exploring lawsuits, but you have to prove the use of copyrighted material in the training of LLM systems, which isn’t straightforward unless the company straight-up admits it.

It’s comparatively easier for Hachette to sue the Internet Archive since they explicitly reference individual books.

29 Jun 2024 at 3:14 | Open on friendica.mrpetovan.com

A. P. Howell

@mitch @ernie Libraries do have additional rights/protections under US copyright law.

29 Jun 2024 at 4:38 | Open on wandering.shop

Jared Davis

Such a model clearly ought to be named “Borges”

@mitch @ernie

29 Jun 2024 at 12:54 | Open on mathstodon.xyz