Elizabeth Tai | 戴秀铃 🇲🇾

Hello #WritingCommunity
Apparently #Google has changed their privacy policy and now says that they'll scrape everything you post online to train their AI tools.
I even post my #Fiction online on #Substack & my #Wordpress blog and now wonder if this is a bad idea.
They say paywalls could deter the scraping.
What do you think writers can do to protect their content? Or should we just roll over and accept that this is the way things will be from now on?

gizmodo.com/google-says-itll-s

36 comments
Calamity Caitlin

@liztai does this even include private documents? This makes me so goddamn angry.

Elizabeth Tai | 戴秀铃 🇲🇾

@squeevening apparently paywalls may deter them, but that's theory at this point. I honestly don't think anything will escape them.

Calamity Caitlin

@liztai but I *draft my novels* in google docs. Time to find new editing software, seems like.

Cozy

@liztai @squeevening There’s this concept of the 5Ds, are you familiar with it? Bottom line is they can’t steal what they can’t trust, or if they do steal it anyway they’ll suffer a loss of internal integrity. This is the same line of reasoning as to why proliferation of “AI generated content” will be poison to future training of new capabilities. wired.com/beyond-the-beyond/20

Cozy

@squeevening @liztai Well I was thinking more along the lines of adding a “learn more” link at the end of each page and filling it with content generated by a llama model, chatllama.baseten.co/ or similar. [Edit: I thought about it a little more, no plugin required] Or encipher the body of things you post and include a little JavaScript button that deciphers the text for humans. I’m not sure if a simple cipher like rot13 would be solved by the model before being stored away or if it would encourage it to randomly answer questions with an unexpected Caesar cipher. :ameowbongo:
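
For anyone wondering what the rot13 idea would look like in practice, here is a minimal sketch. It is in Python rather than the JavaScript button described above, purely for illustration, and the helper name `rot13` is made up for this example. It only shows the cipher itself. Whether a scraper or a model would undo it is an open question.

```python
import codecs


def rot13(text: str) -> str:
    """Shift each ASCII letter 13 places; digits, punctuation and spaces are untouched."""
    return codecs.encode(text, "rot_13")


post = "Hello #WritingCommunity"
scrambled = rot13(post)      # what would sit in the page source
restored = rot13(scrambled)  # applying rot13 a second time undoes it

print(scrambled)
print(restored)
```

Because rot13 is its own inverse, the same function both scrambles and restores the text, which is also why it is trivial for anyone (or anything) to reverse.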

Calamity Caitlin

@Cozy @liztai Let me read this again after a really good sleep. Not smart enough right now. 😂😂😂

Karawynn Long ♿😷🏳️‍🌈✍🏻🦊

@Cozy @squeevening @liztai

the problem i see with that solution ("solution") is how icky it makes the reading experience for actual people. that's not something i'm willing to do.

Elizabeth Tai | 戴秀铃 🇲🇾

@SeaFury I'm not techy enough to know, but apparently yes, you can do it that way but that means you can't use RSS and your content will be walled from readers.

SeaFury 🏳️‍🌈🌦️

@liztai 😭 RSS is useful. I’d be inclined to hide behind a membership wall. But that limits new readers who might browse past.

Siderea, Sibylla Bostoniensis

@liztai

I am a techie, and a writer, and a robots.txt does not prevent rss. I have a robots.txt on my blog and it has perfectly working rss.

What a robots.txt does do, though, if it disallows Google's crawler, is keep your site from being crawled for Google Search.

What I don't know, because that article made no sense and I haven't investigated further, is whether this change in policy at Google means they are no longer respecting robots.txt files. I have no idea why their privacy policy would have anything to do with that issue, as their privpol is for their users and definitionally people with robots.txt files are declining to be their users.

@SeaFury
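
To make the robots.txt point concrete, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `example.com` URLs and the `ExampleFeedReader` user agent are all made up for illustration. It only shows what the rules say, not whether any particular crawler actually honors them, which is the open question in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that turns away Google's crawler and nobody else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved Googlebot would be told not to fetch the post...
google_allowed = parser.can_fetch("Googlebot", "https://example.com/fiction/chapter-1")

# ...while a feed reader is not covered by any rule, so the feed stays reachable.
feed_allowed = parser.can_fetch("ExampleFeedReader", "https://example.com/feed.xml")

print(google_allowed, feed_allowed)
```

The file is only a request: it tells crawlers what the site owner wants, and compliance is up to the crawler.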

Elizabeth Tai | 戴秀铃 🇲🇾

@siderea @SeaFury Yeah that's the problem - a lot of people are wondering if they'll just ignore the robots.txt

R. Nicole

@liztai Paywalls or not, theft of copyrighted material/your hard work is never ok.

This is why we need legislation that addresses these issues and tells big tech, "No! You cannot steal from creators! If you want it, you need to *pay for it*!"

Trajecient

@appagalcrochet @liztai Not merely a matter of paying creators but obtaining permission.

If, say, I had a photography blog and portraits were being scraped, I wouldn't want to be paid against my will if I was aware the consent of one or more subjects to such use was in question.

And there may be creators who wouldn't want to be paid if they felt too uncomfortable with the potential use, such as in the case of more controversial applications of such datasets.

R. Nicole

@Trajecient @liztai My point was, you would have to agree to a price/sale of goods or have the choice to refuse.

James :fukushima:

@liztai

Thanks for the heads up. I will be taking down my microfiction blog as of today.

James :fukushima:

@liztai Yes. Yes it is. They've probably already scraped it anyway and now what I made for people to read for free is going to be used to make money for someone else. What even is the point.

Wendell Bell

@liztai I would counsel against capitulation, but, as with music files in the digital era, you won't be able to be utterly leakproof.

Viking Chieftain

@liztai - Long gone are the days when Google lived by the goodwill motto "Don't be evil".

Greg Stolze

@liztai Every time you post something meaningful, also post nonsense with an equal word count?

Diana Lloyd

@liztai Yikes! Thinking of the book excerpts on my website and wondering if there’s any sort of language I might include to prevent this. Something like "I do not give permission for my words to be used in AI applications without my express permission." Is that dumb?

Julio Jimenez

@liztai hah, the joke’s on them, everything I post is absolute garbage, code included.

Ellie Renae (writer)

@liztai I may have a workaround for #writers, #Gumroad!

It has a $0+ setting so people still get your work for free, but bots would have to choose their price, enter an email address, click the link inside, and download in order to scrape the text. Like a false paywall! Hopefully this will protect my #writing, and maybe it can protect others too?

I'm not against all uses of #AI, but I'm very much for consent and compensation, and it's hard to fend off the "requests" of a massive tech oligopoly

Bill (he/him)

@liztai If they want to train on my bad spelling and grammar, have at it.

But I can see a long legal battle brewing out on the high seas

Elizabeth Tai | 戴秀铃 🇲🇾

@bllgvn The gall of them thinking that they can get away with it. The EU will drag their ass and make their life hell. The US on the other hand seems to roll over ...

DELETED

@liztai Unfortunately, I don't think we have much choice. I'm not too worried about AI, though. People will eventually realize that whatever is created via AI has no soul or beauty.

DELETED

@liztai Everything is cyclical. Human beings need to be affected by something to realize the errors of their ways. ;-)

Eduardo Sánchez

@liztai Perhaps, in your blog, limiting it via robots.txt?

Dendan Setia (Nins)

@liztai
Idk about paywalls but srsly considering making my blog pw-protected. Haih.

Elizabeth Tai | 戴秀铃 🇲🇾

@cendawanita IKR. It seems like big tech is determined to make everyone hate them this year. Every platform is going nuts over AI and, as a result, screwing over the users to get that stupid pot of gold.

Techviator

@liztai
Maybe something like the PDA access restriction plugin could be implemented to add a paywall but only for Google IPs. That should allow their indexers to find your website and titles, but limit how much content they can access... maybe a new plugin will have to be created, but technically it is possible.
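
A rough sketch of the "paywall only for certain IPs" check, using Python's standard `ipaddress` module. The function name `should_paywall` and the address ranges are assumptions for illustration: the ranges are reserved documentation blocks, not Google's real crawler ranges, and this is not how the PDA plugin itself works.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), NOT real crawler ranges.
# A real setup would need the crawler operator's published ranges.
RESTRICTED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def should_paywall(client_ip: str) -> bool:
    """Return True if the visitor's IP falls inside any restricted range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in RESTRICTED_RANGES)


print(should_paywall("192.0.2.17"))   # inside a restricted range
print(should_paywall("203.0.113.5"))  # ordinary visitor, outside the ranges
```

A site could serve titles and teasers to restricted addresses and the full text to everyone else. It only works as long as the range list is kept up to date.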
