Pajo_16

@nixCraft Is there any way to find these sites? You are correct in that they disappeared but there must be a way to find them. Hopefully 🤞

Paul Sutton

@Pajo_16 @nixCraft

If people find them, perhaps post links here.

Beam me out!

@fury999io @Pajo_16 Thanks! I wanted to recommend this, but realized I didn't bookmark it and forgot the url.

DELETED

@fury999io @Pajo_16 aren't there some general open web directories somewhere too?

sigi714

@Pajo_16 @nixCraft On page two. On page 5 you can find bad SEO optimizing agencies.

marcus 😷🧂

@Pajo_16 @nixCraft I've heard people saying that they're adding "before:2023" to their searches to filter out a lot of the ai garbage. Will only help as long as the search topic is somewhat timeless, obviously.
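
A minimal sketch of the date-filter trick described above, assuming the search engine accepts a `before:` operator inside the query text. The function name `dated_query_url` and the Google base URL are illustrative assumptions, not something named in the thread.

```python
from urllib.parse import urlencode


def dated_query_url(terms, before="2023", base="https://www.google.com/search"):
    """Build a search URL with a `before:` date filter appended to the query.

    `base` is an assumed endpoint; swap in whichever engine you use,
    provided it understands the `before:` operator.
    """
    query = f"{terms} before:{before}"
    # urlencode percent-escapes the colon and turns spaces into '+'.
    return f"{base}?{urlencode({'q': query})}"


print(dated_query_url("rss reader"))
```

As noted in the replies, engines may not honor the operator reliably, so treat the filter as a hint rather than a guarantee.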

Pajo_16

@marcusdeh @nixCraft
Folks, thanks for all the suggestions. It's appreciated.

P Stewart

@marcusdeh @Pajo_16 @nixCraft Even that seems shaky these days - I'll try putting date restrictions on searches and regularly get stuff that the results page says is a week old, but actually dates from 2011. (Or vice versa.)

I'm not sure if they're just not respecting search syntax, something's breaking on the search engines' side of things, or if people are figuring out ways to make pages appear to be a different age than they actually are.

Christian Krebel

@Pajo_16 @nixCraft I can recommend the search engine #Kagi. They have a different index, one can put a weight on specific domains and they have a project called small web to randomly find those gems.

P4

@ChristianKrebel @Pajo_16 @nixCraft isn't Kagi in on the AI bullshit? I wouldn't trust them not to screw everyone over once they get popular enough.

Christian Krebel

@p4 @Pajo_16 @nixCraft Well, they have integrated AI, but mostly on demand: most of the time you have to trigger an AI feature yourself. Also, they have an assistant where you can choose the models you want to use (APIs are more anonymous), and the answers will have sources from their index, which is the best of both worlds IMHO.

Jeremy Yap

@Pajo_16 @nixCraft

If you're looking for blogs and other personal sites I recommend bookmarking these search engines:
- search.marginalia.nu/
- ichi.do/
- clew.se/
- searchmysite.net/
- wiby.me/

Would also recommend checking out this excellent piece for alternative search options: seirdy.one/posts/2021/03/10/se

Third spruce tree on the left

@jeruyyap @Pajo_16 @nixCraft Support the micro-brew search engines that are trying to make Search Engines Not Suck Again!

(new slogan #SENSA)

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@jeruyyap @Pajo_16 @nixCraft

Happy to be mentioned! ;)

Clew is very beta at the moment but I've just started my summer break so I should have some good time for dedicated work on it. :)

Paul McBride

@jeruyyap @Pajo_16 @nixCraft Kagi has a great “small web” search filter too

Eric the Cerise

@Pajo_16

Learn how to search the 'Net w/o the mainstream engines.

Some of the best-known alternatives are getting bad, too (DuckDuckGo is pushing its own AI "solution"); others are "meta-search engines", which run your query thru Google, Bing, etc. for you ... which is great for privacy but doesn't do much to improve the search quality.

Look at the various #SearXNG instances (searx.space/ ) ... those are meta-search services, but they hit a *lot* of primary engines.

Look for new engines, new projects, alternative solutions.

The #IndieWeb used to have a search engine indieweb.org/search ... looks like it might be dead now? But check it out anyway.

Check out wiby.me , an alternate search engine specifically for those old, alt, and independent web sites that the main engines are burying.

Search for search.

stract.com/ is a beta project, one-person show running on a server in the guy's basement, but it has promise.

clew.se is a *very* beta service, made by a kid who just graduated college last week (congrats @amin ! ), he is also trying to set up an algorithm that prefers alternate, home-made "real-people" websites.

There are many alternatives out there ... most are not "ready for prime time" and it's gonna be a learning curve to adjust.

Don't just use 'em. Promote 'em. Help them if you can. Many need non-tech help with documentation, translations, etc.

@nixCraft

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@ErictheCerise @Pajo_16 @nixCraft

I, uh, didn't actually graduate college. I just finished my sophomore year. XD

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@ErictheCerise @Pajo_16 @nixCraft

Hahaha, no worries, this is a continuation of a long, long trend of people overestimating my age online; you're actually closer than most. The usual guess is that I'm in my 30s. XD

Working on a Communications degree with a Professional Writing minor. :)

Sci-Fi Girl

@ErictheCerise

Maybe add search.marginalia.nu/ to the list?

Their focus is on finding small, old and obscure websites. 😎

And I'll have to look at the ones in your list that are new to me!

@Pajo_16 @amin @nixCraft

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft

Yeah, Marginalia 1000% deserves the spot more than me. I took a ton of inspiration from their work and they've actually been around for significantly longer than this recent wave of launches. :)

Sci-Fi Girl

@amin

Cool! Having more options is definitely better!! 😎

@ErictheCerise @Pajo_16 @nixCraft

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft

Yep!

It's very similar to Clew in goals (promoting personal, non-commercial websites) and even uses the same ranking function at heart (BM25F) but I did make a number of changes in methodology, for example:

- Most of my webpage discovery is centered around RSS feeds (which is both a great mature technology and means sites with RSS feeds [often personal sites] are gonna be better-treated by the crawler)
- Marginalia still indexes big sites like Wikipedia and StackExchange while I specifically blacklist them from the crawler (helps emphasize small sites and saves significant resources for the crawler; I may do some kind of integration in the future but for now I have bangs if you wanna search them)
- Marginalia does warn about javascript, ads, etc., but I don't think it affects pages' rankings, while I penalize ads and trackers
- I'm really proud of my brand new page weight indicators, which I haven't seen anything like in other search engines before. :)

All that said Clew is definitely still very beta. XD
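
For readers unfamiliar with the ranking function mentioned above, here is a minimal sketch of plain BM25 over whitespace-split tokens. It is not BM25F (which additionally weights fields such as title and body) and it is not Clew's or Marginalia's actual code; the function name `bm25_scores` and the defaults `k1=1.2`, `b=0.75` are conventional illustrative choices.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with plain BM25."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            length_norm = 1 - b + b * len(tokens) / avgdl
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "my personal blog about gardening",
    "buy cheap shoes online now",
    "gardening tips gardening tools",
]
print(bm25_scores("gardening", docs))
```

The second document scores zero because it never mentions the query term, and the third outranks the first because the term appears twice in a shorter document.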

Andrew Zonenberg

@amin @5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft You down rank pages with ads and trackers? If only this was more common...

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@azonenberg @5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft

Right??? XD

But the mainstream search engines are mostly run by companies that sell advertising and tracking services, so it's not likely to happen there.

I've found it really effective at fighting SEO, though; if people are trying to hack the system to get you on their site, they probably have ads or tracking. ;)
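
One way a penalty like this could be sketched, assuming a page's relevance score is simply scaled down for each script loaded from a known ad or tracker host. The blocklist `TRACKER_HOSTS`, the class `ScriptHostCollector`, the function `penalized_score`, and the halving factor are all hypothetical; the thread does not say how Clew actually detects or weighs ads and trackers.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical blocklist for illustration; a real one would be far longer.
TRACKER_HOSTS = {"ads.example.com", "tracker.example.net"}


class ScriptHostCollector(HTMLParser):
    """Collect the hostnames of externally loaded <script> tags."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if src:
            host = urlparse(src).hostname  # None for relative URLs
            if host:
                self.hosts.append(host)


def penalized_score(base_score, html, penalty=0.5):
    """Multiply the score by `penalty` once per script from a listed host."""
    collector = ScriptHostCollector()
    collector.feed(html)
    hits = sum(1 for host in collector.hosts if host in TRACKER_HOSTS)
    return base_score * (penalty ** hits)


page = '<script src="https://ads.example.com/a.js"></script><p>hello</p>'
print(penalized_score(2.0, page))
```

A multiplicative penalty keeps clean pages untouched while pushing heavily instrumented pages down quickly; an additive penalty would be an equally reasonable choice.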

Eric the Cerise

@5ciFiGirl

No 'maybe' about it, that's an awesome one, and new for me. Check out his 'About' page ( marginalia.nu/marginalia-searc ) ... I want to have his babies.

@Pajo_16 @amin @nixCraft

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft

A ton of people link to his "The Small Website Discoverability Crisis" when justifying their own search engines (which is great) but I also find it kinda hilarious that they often don't seem to realize he has his own search engine too. XD

Amin Hollon 🇺🇸🇲🇾🇮🇳🇦🇫

@Pajo_16 @nixCraft

Blogrolls are a great option. :)

Mine's here if you want a good starting place: https://benjaminhollon.com/blogroll/

Then many of those sites have their own blogrolls; and so on and so on and so on.

Elias

@Pajo_16

> Is there any way to find these sites?

One alternative, independent search engine is #Mojeek, which has its own index. Using it, you may be able to find things that Google/Microsoft decided to remove from their search results: mojeek.com/
@nixCraft
