William Pietri

This is glorious. The best time to burn a bridge is when you never, ever want to cross it again. #genai (Edit: This is a notice explaining why somebody is shutting down a long-running project to measure word frequencies.) github.com/rspeer/wordfreq/blo

40 comments
T. T. Perry

@williampietri

Robyn Speer is a treasure and she makes great points.

Rosco

@williampietri "I don't think anyone has reliable information about post-2021 language usage by humans."
Wow. No words could describe the feeling I got reading that.

Nat Pryce

@rosco @williampietri Give it a few years and humans will talk like the AI-generated slop they see on the Internet. Look how shorthand invented to save time typing on 1990s mobile phone number pads became used in everyday speech.

William Pietri

@natpryce Could be! Although I could see it going the other way, people leaning into slang and other human-driven linguistic evolution so that they are more likely to feel real. And perhaps we'll see a fair bit of both.

@rosco

unusual zone of infecundity
@williampietri @natpryce @rosco that's not too huge a worry, people are always surprisingly* good at communicating complex ideas through limited or inappropriate vocabulary

but the demand to conform to the informational and epistemic meta-structures of how AI and corpo web etc. handle what is true and what deserves attention is going to do enormous amounts of damage

*assume an intelligence roughly equal to one's own, which you thereby cannot comprehensively model, but trained on often significantly different foundational experiences. it's impossible not to be eventually surprised
publius

@natpryce @rosco @williampietri

For a while, telegraphers' and radio operators' shorthand expressions crossed over into general use.

Duncan Blair

@rosco @williampietri reminds me of how carbon dating suffered after nuclear tests in the 1950s

eviloatmeal
@williampietri Good for them. Sorry their passion project was ruined by slop.
flere-imsaho

@MichaelPorter of all things, delve is the most human artifact of the llms – it's a result of employing nigerian english speakers. @williampietri

Michael Porter

@mawhrin @williampietri If that’s true, maybe there’s hope for the LLMs yet 😊

v̾i̾t̾r̾i̾o̾l̾i̾x̾

@williampietri Tldr: lib that scrapes human language is going to stop maintenance because the web is polluted with ai slop

Mister Moo 🐮

@williampietri "Reddit also stopped providing public data archives, and now they sell their archives at a price that only OpenAI will pay.

And given what's happening to the field, I don't blame them."

I do. Reddit's mid-2023 moves were, and still are, disgraceful and have resulted in a reduction of my usage by 99%. (Yes, I know I'm the only one.)

ಚಿರಾಗ್ 🌹✊🏾Ⓥ🌱🇵🇸 (he/him)

@MisterMoo @williampietri You're not the only one. When they killed third-party apps, I stopped going to Reddit.

Alex Ball

@chiraag @MisterMoo Same! I didn’t even use a third-party client, but shutting them out was the reason I walked away.

spooky blip 👻

@chiraag @MisterMoo @williampietri there's three of us. I deleted all my data and my account. I still end up with a bunch of Reddit access via proxy frontends via LibRedirect, because Reddit is still the central forum of the internet where everything that doesn't live on Twitter or Facebook/Insta lives, and I often want to read forum-style content, but... It's read-only for me, and via a frontend.

David Nash

@MisterMoo @williampietri Yet another one. Reddit (outside of old.reddit.com) was really only tolerable with third-party apps and when Reddit effectively killed those off, I deleted my account and much of what I wrote there (which already wasn’t a lot). I’ll still look at plausible search results from Reddit as long as that’s still a useful trick, but that’s about it.

DC Rat

@MisterMoo @williampietri I really hope somebody leaks the twitter database. The use of that dataset in Because Internet is amazing.

Jake Rayson

@williampietri

> Now the Web at large is full of slop generated by large language models, written by no one to communicate nothing.

👏

priryo

@williampietri
"written by no one to communicate nothing" is poetry.

Cluster Fcku

@williampietri "OpenAI and Google can collect their own damn data. I hope they have to pay a very high price for it, and I hope they're constantly cursing the mess that they made themselves." ... they won't see it that way, but come up with justifications that change laws and public opinions. The goal is to circumvent any privacy and moral guards, and further manipulate our emotion-hormones to get us to divulge what they want, and have us pay for it by giving up our cherished common good.

janet_catcus

@williampietri smells like another ai winter... except it's the other way around, it's not that nobody talks about or does ai but that nobody talks or does anything because of ai

Esther Payne :bisexual_flag:

@williampietri that was a truly beautiful goodbye.

It's really sad she had to do this.

DELETED

@williampietri
That was pretty haunting and almost painful to read. I could feel some of her pain in the words and it really concisely brings home how much more damage GenAI has done.

Thank you for sharing!

Tom Walker

@williampietri It is terribly sad that the largest collection of human writing ever assembled is now polluted in such a way

Ross McKay

@williampietri this doesn't bode well for the future of our language. Already heavily influenced by corporate speak via advertisers and US TV/movie culture, AI-generated slop isn't going to help.

Bring on the next generation of word artists please: poets, writers, songwriters. I hope lots of 'em don't lean on ChatGPT et al.

Xenotime (formerly Residual Entropy)

@williampietri

The field I know as "natural language processing" is hard to find these days. It's all being devoured by generative AI.

Seriously, NLP is a cool area with all kinds of cool stuff, but all of the actually clever ideas and actually useful applications have faded away. It’s like when something super bright gets in frame and the camera auto adjusts and now that’s the only thing anyone can actually see.

Aaron

@williampietri That was a great, but honestly heartbreaking, read.

If there's anything I like it's the fact that "slop" seems to be sticking as the name for what GenAI produces.
