Ernie Smith

The irony is not lost on me that the Internet Archive went out of its way to acquire the physical versions of millions of books and loan them out carefully and in a limited way, and is facing a near-extinction-level event over it, while for-profit and VC-backed companies are just stealing people’s content and making up excuses to validate the bad behavior.

118 comments
mybarkingdogs

@ernie Copyright is a tool of those rich enough to enforce it #AbolishCopyright

Agarwal's Twitter Dream

@mybarkingdogs @ernie Abolish it moreso than what it is maybe? I could buy that ideology

Emmy - Dial Tone *biiiiip*

@Agrawals_1_Twitter_Dream @mybarkingdogs @ernie I'd be happy to start with disallowing any form of bargaining away or transferring an artist's right to their own work for any reason, for all art, including newer media like video games.

Start with.

Mitch Effendi (ميتش أفندي)

@ernie you know, I have to wonder if the inaction on prosecuting LLM training companies actually introduced a legal loophole for libraries.

Consider that right now, the American legal standard is that GenAI output is considered a derivative work, even if it derived it from 30 billion works. I wonder if the Internet Archive "chunked" editions of books together into a specialized model, could they then "loan" the book out by inferencing a near exact but legally 'distinct' copy of that work?

Mitch Effendi (ميتش أفندي)

@ernie after all, coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point. We know it's possible — it's how the tech started, by being able to reproduce a ground truth image despite never having actually been exposed to the original file.

I dunno, I'm just some idiot online.

Ernie Smith

@mitch Just thinking of how Aaron Swartz—how his good intentions got exploited by people looking to make maximum $$$ and an org actually working in his spirit is being threatened

Mitch Effendi (ميتش أفندي)

@ernie I miss Aaron a lot, we used to chat through Reddit comments way back when. I'm very glad that he didn't have to see all this, let alone what Reddit and Steve Huffman both turned into.

DELETED

@mitch @ernie Or they could create the Library of Babel and publish every book that ever was or will be written (for a given alphabet and character limit).

🌸💪Big🐐Tuff🦖Al🔪🌸

@mitch

@ernie
there's already ebook DRM that works like this 🤣

feld
@mitch @ernie

> coaxing an LLM to reproduce a reference work basically in full is pretty established research at this point.

If that was the case Sarah Silverman's lawsuit wouldn't be going so poorly. They even avoid claiming it can produce the reference works.

Hypolite Petovan

@mitch @ernie “Inaction” is inaccurate, publishers are exploring lawsuits, but you have to prove the use of copyrighted material in the training of LLM systems which isn’t straightforward unless the company straight-up admits it.

It’s comparatively easier for Hachette to sue the Internet Archive since they explicitly reference individual books.

A. P. Howell

@mitch @ernie Libraries do have additional rights/protections under US copyright law.

Jared Davis

Such a model clearly ought to be named “Borges”

@mitch @ernie

the Amygdalai Lama

@ernie @mitch
.
I mean it’s so much easier than physically burning libraries - but the work has never deterred them anyway. I’ve always known my blog and most of my speech was going to disappear the minute the power went out. Think of the trees we’ve saved. ❤️🙄

Wiredfire

@ernie while I totally agree from a moral point of view, IA knew they were flouting rules and got sloppy / arrogant about it. While the copyright system BADLY needs reform IA brought this world of pain onto themselves knowing that system very well and should have known better.

For me it raises a pressing matter of who archives the archive..? We need redundancy of such important services to protect them against catastrophe be that technological or bureaucratic.

Wiredfire

@ernie what IA were doing may have been morally right but we all know morality doesn’t come into copyright law 😒

Ernie Smith

@wiredfire I think IA pushed the edges of the program a little too hard during the pandemic but the idea of checking out digital forms of physical books is how things should work. It’s infuriating that it does not

Wiredfire

@ernie absolutely, it’s maddening. In a similar way I have a stack of physical books but I just don’t read them, I always pick up the Kobo instead. There is no legal route for me to simply format-shift them to digital. Of course there are plenty of routes I could go to furnish myself with digital copies, but still.

Wiredfire

@ernie There was a service that tried to do this for a while: you had to write a message in the book and send a photo as evidence, and it used that to determine one book = one ebook. Naturally publishers hated this idea and the whole thing fell apart fairly quickly 😠

Jessamyn

@wiredfire @ernie I have never heard of that "write a message in the book to prove ownership" thing and as a librarian who has also worked at Open Library, I am very curious, Do you remember what it was called?

Wiredfire

@jessamyn @ernie it was called “Shelfie”. All gone and buried now but this is an article that touches on how it worked from the end user point of view:

time.com/4146187/shelfie-app-f

Death by Lambda

@wiredfire @ernie morally right or morally evil?

If the former, please help me understand how AI has been morally trained?

Riley S. Faelan

@wiredfire *hiss*! *boo*!

You're pretending that the "rules" are some sort of objective, knowable, and respectable thing? *hiss*!

:blobcatglare:​

@ernie

DELETED

@ernie

Our courts are worthless

Our system is worthless

Defending it in any way is now approaching evil

Frances Larina

@ernie

That's not irony. It's capitalism and corruption. It's a system that at its core does not work for the people.

Craig Nicol

@ernie do you think OpenAI could have got its corpus of text without Internet Archive (and Project Gutenberg)?

The tragedy of the commons is that some mobster always wants to profit from it at everyone else's expense.

D. Schmudde

@craignicol @ernie “tragedy of the commons” was more about resources running out. The problem with the idea is that common spaces generally functioned well until property ownership and capital changed the equation.

In other words, human fictions like copyright ruin commons, not some natural law.

Rob Hooft

@ernie some organizations make enough money by violating the law that they can sue everyone else to extinction.

tom jennings

@ernie @jonny

I honestly think that if IA's intent were profit they'd have an easier time arguing.

But we'll see what the judges do this fall...

Lynn McAlister UE

@ernie If something is done with the intention of making things better for everyone, it's a crime. If something is done to make a profit, despite damaging everyone, it's considered virtuous.

Angie 🇵🇸🇺🇦

@ernie

The decision they made during COVID to not limit loans to what they had on hand was a well-intentioned but bad decision.

Jonathan Kamens

@ernie That's what they started out doing. And if they had kept doing that, they probably never would have been sued, or if they were, there's a good chance they would have won.
But then they started entering into agreements with libraries all over the country: let us integrate with your online card catalog, and then we can let people check out one digital copy of any book you currently have on your shelves. Qualitatively different, much more questionable in terms of copyright law.
(continued)

Jonathan Kamens

@ernie And then COVID happened and they removed all borrowing limits and let anyone check out any book, infinite copies. And that's when the publishers noped out, decided they had gone too far, and sued.
What they did during COVID was clearly theft and clearly deprived authors of income. There was no ambiguity. It was a shitty, stupid overstep.
They got sued and established a shitty precedent because they went too far and pissed off the publishers too much. It was dumb.

Ernie Smith

@jik It wasn’t a shitty, stupid overstep. It was a well-intentioned overstep at a time when people couldn’t go to the library.

Was the result damaging and ultimately out of bounds? Yes. But I absolutely draw the line at calling it dumb. The publishers moved out of bounds too here—they didn’t have to go after the whole thing. That was their choice.

Let us not lose context here.

Jonathan Kamens

@ernie Ah, so now we're moving the goalposts.
You knew when you posted your first post above what IA _actually_ did, but you chose to describe it inaccurately so you could make IA look better and the corporations who sued them look worse than actually justified by the circumstances?
And then when called out on it, you came back with "well ok, it actually wasn't as bad as I said, but lord, it wasn't good."
Yeah, no.
I don't waste my time here with people who pull shit like this.
*plonk*

DELETED

@jik @ernie
The publishers stand fully within the bounds of the protection of their copyrights.

Fully scanning, storing electronically, and distributing a work protected by someone else's copyright is out of bounds of legality and fairness.

The Internet Archive pirated someone else's property in the name of a purported social cause or closeted authoritarian socialism. They held no entitlement to do so.
That is the context.

Ernie Smith

@AndersBaerbock

This is where I stand on this. And all I was honestly trying to say to @jik before he kind of went left. I wasn’t trying to piss him off, just to say the program was to some degree defensible.

writing.exchange/@ernie/112696

Ernie Smith

@AndersBaerbock @jik You clearly disagree, but this is my view. The Internet Archive should be able to scan books and lend them. Fair use should extend in this way.

DELETED

@ernie @jik
If you think that fair use should extend in some specific way, then you should submit your notion to the test of actual democracy.

If most people, after a healthy discussion, agree with you, your idea will become incorporated as law.

If democracy is nowadays broken, IMHO it is better to fix it rather than embracing the law of the jungle like the IA attempted.

DELETED

@ernie @jik
Digitalizing a physical copy of a copyrighted book is not defensible.

I explain: When you buy a printed book, you merely buy a copy of the literary work, but you do not buy the work itself.

So, becoming owner of a printed copy of a book does not grant you the right to digitalize it.

Ernie Smith

@AndersBaerbock @jik I disagree. That frames the content owner as having fewer rights in a digital realm. Fair use should bend more than that.

Libraries should work exactly the same in the digital era as they do in the physical one. The belief that they're somehow different feels like a warping of the library’s intent, as well as the intent of fair use.

It's possible the case in front of the court may go either way on this point.

DELETED

@ernie @jik
«That frames the content owner as having fewer rights in a digital realm.» False: the content (copyright) owner can also sell you a digital copy of the literary work —if actually wanted to.

And exactly like in the physical realm, the copyright owner must explicitly allow you to make more digital copies and distribute those.

Regarding fair use: As I said, actual democracy may serve as the test in a civilized world; not the law of the jungle.

Have a nice weekend.

Ernie Smith

@AndersBaerbock @jik You totally misread what I said. The content owner is the person who bought the object. I specifically did not say copyright owner.

DELETED replied to Ernie

@ernie @jik
Thanks for saying.
Anyway, the rights are the same both in the physical and digital realms:
When a person buys a copy —whether digital or physical— of a literary work, that act does not entitle the person (or organization) to make more copies —whether digital or physical— and distribute them —whether digitally or physically.

Bye! 2nd time, and last.

Scott D. Strader

@ernie omg I hate to be the lazy non-researcher, but... I knew of the issue with loaning books, but does "a near-extinction-level event" mean the entire site is under threat?!

Ernie Smith

@sstrader When you have to pay massive legal bills for a long-running case, it doesn’t do amazing things for your financial situation.

Ernie Smith

@sstrader I think this NYT story really does a great job of highlighting just how dangerous this has all been for the archive.

nytimes.com/2023/08/13/busines

Scott D. Strader

@ernie ahhhhh, duh. <he says as he goes to up his monthly donation>. Internet Archive is one of the few sites from the Halcyon days of early tech optimism that truly has lived up to what we all dreamt of.

Goddamn these corporations

Ji Fu

@ernie both of these are acceptable. Intellectual "property" doesn't exist. You can't steal an idea.

Ernie Smith

@Fu Copyright law would disagree with you

Ernie Smith

@Fu I think copyright law is super-broken, don’t get me wrong, but it’s not quite as far as intellectual property not existing.

I think both of these things cause people to be disillusioned by copyright law

DELETED

@Fu @ernie

FYI:
«always keep in mind that copyright protects expression, and never ideas, procedures, methods, systems, processes, concepts, principles, or discoveries.»
#copyright
copyright.gov/what-is-copyrigh

DELETED

@ernie this just in: predatory systems are abusive and ruin everything; the internet is a bad place and people are irredeemable. I feel like i've been hearing literally the same thing over and over and over said with different words since the internet began and since i met people for the first time.

Jörg Seidel

@ernie
You have to understand that. The Internet Archive is evil. Worse, it's non-profit.
@jbz

GhostOnTheHalfShell

@lostgen @ernie @jbz

I think that sarcasm eroded part of my screen.

The way the publishers are going, once they own it all, they’ll delete it.


@ernie

You know why lobbyists are expensive?

Because they're worth it!

DELETED

@ernie
Libraries don't feed the bottom line. It's why they are under attack and Vulture Capital gets a pass.
The other thing though is that archives hold direct copies of information and intact history that cannot be easily modified to reflect new mandates from the Ministry of Truth.social. With Generative AI history is a matter of probability.

Ernie Smith

@hipsterelectron

“Near extinction-level”

1) The proposed fines in the case ($150,000 per title) would have bankrupted the organization, though it was eventually narrowed down to 127 titles
2) Legal filings are expensive
3) They’ve had to remove a lot of content already

Sci-Fi Girl

@ernie @hipsterelectron

Are the archives of internet pages in danger too? (Genuinely wondering.)

ari 💫

@ernie i don't really consider myself a radical activist, but every once in a while i read about stuff like this and reconsider my stance

Adlangx

@ernie we will just pirate them. It will be fine.

BlueBee

@ernie

That's because the truth has always been and always will be.

Money = Power

Tricot Feelya

@ernie archive.org has really tightened its access policy. To read many works now you have to be certified vision disabled.

GolfNovemberUniform

@ernie they are just bad people/companies. What else did you expect?

ECHAEA

@ernie
When it's for profit, it's OK 🤷‍♂️

Pablo Martini

@ernie I foresaw this in part @ withdrew loads of my Images, purging the data (hopefully) some of my work curated by gits at bbc! got bought as part of Hulton by Getty (oil magnate fame!) images.
one out all out!

wraptile

@ernie copyright has been a broken system since the very inception.

Robert

@ernie The lesson is: Instead of investing donations into purchasing copies, they should have invested it into legal, like most companies seem to do

Prastowo Yustinus

@ernie@writing.exchange No problem, use Anna's Archive instead. Stop using the Internet Archive for reading books, because the site admins and team will remove material if they receive a message from stingy companies that want to destroy old material for profit.

https://annas-archive.org/

Dan O'Neill

@ernie back in the early 2000's MP3.com paid for physical CD's, made digital copies available to stream, but only if you could prove via a sophisticated hashing algorithm that you had your own physical copy at your end. The security of the "lending" didn't matter, it was the copying and subsequent distribution.

One has to lobby Congress to change Copyright law. Is Lawrence Lessig writing anything about IA's position?

Ernie Smith

@dkoneill One gets the feeling that trying to convince legislators of the importance of this would be immensely fraught in part because the advocacy groups are much more established on the copyright-holders’ sides.

I was just thinking about the MP3.com case. The fact that isn’t how copyright law works is wrong; it was clearly an idea ahead of its time.

Rob McKenna

@ernie but they are making profits silly! Delivering value for shareholders! Totally different to providing a useful service for not money.

coffe☕

@ernie We need to stop talking about everything we have to do. We need to act now.

We already live in an information economy dystopia.

Those with enough power and money to hide behind wealth and corporate structures can freely exploit our private data, while those aiming to serve the public interest are immediately punished

bitsavers.org

@ernie

We also know now what the real motivation behind the Google Books project was. To provide a training base for their AI

Archnemysis

@ernie ok, but internet archive did not buy the rights to electronically distribute those books, no matter how careful or limited that electronic distribution is. ALL they bought was the paper someone else’s work was printed on. That someone else has a right to determine how their work is distributed.

Kevin Karhan :verified:

@ernie except the former one did loan more copies than they had and the latter one isn't stealing because if that was copyright infringement, we'd all be perpetual #DebtPeons starting in kindergarten or elementary school...

Alexandre Oliva
your post gave me the following idea:
the archive should train some LLM on all of those books, and then publish the trained model.
who'd want to borrow the books under DRM if they can have a locally-running LLM that can search, summarize or even "write" them on demand?
crossing these rays would pit the LLM giants against the book MAFIAA. in such a fight, we should all be rooting for the fight, but if it brings LLM giants to defend the Internet Archive, that could be good?
cc: @brewsterkahle
NeonkAaa

@ernie
Sue them. Why it's about sitting as always?

Cass (they/them)

@ernie Not only this but the big publishers and online retailers make a lot of money selling people books that they can pull at any time because what’s being sold isn’t the books themselves but digital access to them. It shouldn’t be lost on anybody that they are doing this with the form of books that are most accessible to blind and otherwise print impaired people, which locks us into having to use that shitty model more than others.

MyMimir

@ernie this proves true on so many levels and should be hammered into ppl‘s heads

Kartik Agaram

@ernie Laws are for lawyers. IP needs to be armored by lawyers.

sleepfreeparent

@ernie the stranglehold that publishers have on our entire culture is obscene. We have to find ways to support authors that don't further empower these ghouls. Meanwhile, z library doesn't bother paying publishers anything, and has books and scientific articles freely available to all, and has yet to be taken down.

TOR URL:
loginzlib2vrak5zzpcocc3ouizykn6k5qecgj2tzlnab5wcbqhembyd.onion
