R E K

Lately, while doing general research, I often land on webpages that are hosting erroneous generated content.

A site about how to fill a pillow looks legit at first, then lapses into how to make filling for food.
A site about boating uses fake people to write articles about boating, talking about how to make trips they've never done or never plan to do (obvious from the writing).

It is gonna get harder and harder to find good information out there.

25 comments
Devil Lu Linvega

@rek How To Stuff Your Pillow With Turkey Stuffing Recipe

wait, wat

R E K

I'm going to start building myself a list of worthwhile, trustworthy sources (for everything) that I know are written by experienced humans because this is... getting out of hand.

Lizbeth

@rek yeah I've been thinking about this too, and search engines aren't as helpful as they used to be either with all the focus on selling products. Vetted online "libraries" of trusted info would be really good to start putting together

R E K

@ritualdust indeed it would. If you make a list do share it ^_^!

Lykso

@rek If you don't mind, could I piggyback on your efforts? (Do you plan on making this list public anyway?) I'm unlikely to put time into constructing such a list myself (unsure where I'd start, TBH), but I do think such a thing would be quite useful.

0gust1

@rek it was already the case before the AI thingies (content spinning / blackhat SEO), but it has gotten much worse lately!

R E K

@0gust1 i could navigate the mess well enough before, but the sheer volume of nonsensical garbage I have to wade through now makes me want to give up :/...

0gust1

@rek the worst part (?) is that search companies (Hoogle) are literally digging their own grave in the medium term by allowing these kinds of websites in their search indexes.

ex_06

@rek I’ve done the same in the past when I wanted to use less Google. Would be nice to have a small search engine to which you say “crawl these sites” and it just crawls those and searches only there from now on. Or maybe the crawler would be fine hosted by a community and then you download the subset of the sites you wanted… But I still can’t code for now e_e
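
A minimal sketch of the "search only the sites I trust" idea, in Python. Everything here is an assumption for illustration: the host names in `TRUSTED_HOSTS` are placeholders, the function names are hypothetical, and the pages are assumed to have been fetched already (no crawling is done). It only shows the filtering and lookup step, not a real search engine.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of trusted hosts; these names are placeholders.
TRUSTED_HOSTS = {"example.org", "docs.example.net"}


def is_trusted(url, trusted_hosts=TRUSTED_HOSTS):
    """Return True if the URL's host is a trusted host or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == t or host.endswith("." + t) for t in trusted_hosts)


def build_index(pages):
    """Map each lowercase word to the set of trusted URLs whose text contains it.

    `pages` is a dict of {url: page_text}; fetching the pages is out of scope.
    """
    index = {}
    for url, text in pages.items():
        if not is_trusted(url):
            continue  # skip anything outside the allowlist
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)
```

The allowlist is the part a community could share: each person keeps their own set of hosts, and the index is rebuilt from whatever subset they choose.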

Tanquist

@rek
I just made a list of all of the #AI text and image sites I could find so I can block them from my #kagi searches (I've also disabled kagi's #FastGPT service). I would be happy to share my list but I don't know how. I don't have a web site.

🇳🇱 Jeroen 🇺🇦 🇺🇦

@rek If anyone cares to revive dmoz.org I'd happily chip in.

Jonas

@rek
Thanks, the one about configuring Firefox is a good one to share

timthelion

@rek Seriously, it's almost as bad as 15th century herbaria.

Tendigits

@rek I bet the tiny bit of revenue one can make from filling web pages with adverts has been amplified with the ease of creating junk content from AI bots. People are loading the search engine indexes with a sea of trash, for profit.
I think you are smart to create your own library of knowledge and links.

R E K

@tendigits You gonna make a list? Or maybe you already have one? I would love to make a 'list of lists' haha~ everyone has different specializations/interests, and different levels of experience (to better weed out the nonsense) so it would be nice to consolidate that information.

Tendigits

@rek I do! I have an "other sites" page with very my-projects-centric bookmarks. :) tendigits.space/site/other-sit

Rafael

@rek @tendigits lists of lists :blobcatjustright:
My dream is a network like mastodon for sharing and searching lists. Like a web of trust for links to high quality information. I’m building that right now but it’s in the early stages still

Avi Bryant

@rek it’s so, so bad. And it’s just going to get worse.

elektron

@rek that reminds me of a small part of Neal Stephenson's novel Anathem. Where AI was spreading so much disinformation, another layer was needed on the world-wide network that took into account information source reputation for filtering the "inanity generation".

margot

@rek this has definitely been a big motivation for me to try and collect as many offline resources as possible for reference, although it sucks for anything that requires more recent info

Dakedres

@rek I guess we're discovering that the library of babel started with actually coherent texts.

Mike

@rek Yeah, time to start compiling the lists and sharing them out blogroll-style.

My site (and the posts within) may not represent 100% accurate info 😅 but it *is* 100% human-generated (me as the human).

vacuumbeef

@rek I wanted to search for some ideas for a DIY shower curtain rod. And well, I found something, but it was on some sports and workout website. The article had photos of people working out in the gym, but the text was really about curtain rods (though obviously generated nonsense). At first I'm like WTF, okay, generated, but why on a sports website?

But then I got it - barbell rod.
