@5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft
I'm glad you think so. ;)
@amin @5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft
You down rank pages with ads and trackers? If only this was more common...
@azonenberg @5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft
Right??? XD But all the mainstream search engines are mostly by companies that sell advertising and tracking services so it's not likely in them. I've found it really effective at fighting SEO, though; if people are trying to hack the system to get you on their site, they probably have ads or tracking. ;)
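For anyone wondering what "down ranking pages with ads and trackers" could look like in practice, here is a minimal sketch. It is not Clew's actual code: the host list, the regex-based script detection, and the halving penalty are all assumptions made up for illustration.

```python
import re

# Hypothetical ad/tracker hosts; a real list would be far longer.
AD_TRACKER_HOSTS = {"doubleclick.net", "google-analytics.com", "googlesyndication.com"}

# Pulls the host out of <script src="..."> tags (absolute or protocol-relative URLs only).
SCRIPT_SRC_RE = re.compile(
    r'<script[^>]+src=["\'](?:https?:)?//([^/"\']+)', re.IGNORECASE
)


def count_ad_trackers(html: str) -> int:
    """Count distinct script hosts that match the (assumed) ad/tracker list."""
    hosts = {h.lower() for h in SCRIPT_SRC_RE.findall(html)}
    return sum(
        1
        for h in hosts
        if any(h == t or h.endswith("." + t) for t in AD_TRACKER_HOSTS)
    )


def penalized_score(base_score: float, html: str, penalty: float = 0.5) -> float:
    """Multiply the base relevance score by `penalty` once per ad/tracker host found."""
    return base_score * penalty ** count_ad_trackers(html)
```

A page with no matching scripts keeps its base score, and each distinct ad or tracker host multiplies it by the assumed penalty factor.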
@5ciFiGirl @ErictheCerise @Pajo_16 @nixCraft
Yep!
It's very similar to Clew in goals (promoting personal, non-commercial websites) and even uses the same ranking function at heart (BM25F) but I did make a number of changes in methodology, for example:
- Most of my webpage discovery is centered around RSS feeds (which is both a great mature technology and means sites with RSS feeds [often personal sites] are gonna be better-treated by the crawler)
- Marginalia still indexes big sites like Wikipedia and StackExchange while I specifically blacklist them from the crawler (helps emphasize small sites and saves significant resources for the crawler; I may do some kind of integration in the future but for now I have bangs if you wanna search them)
- Marginalia does warn about JavaScript, ads, etc., but I don't think it affects pages' rankings, while I penalize ads and trackers
- I'm really proud of my brand new page weight indicators, which I haven't seen anything like in other search engines before. :)
All that said, Clew is definitely still very beta. XD
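The RSS-centered discovery and the crawler blacklist described above can be sketched roughly like this. It is only an illustration under stated assumptions, not Clew's implementation: the function names and the blocked-domain set are hypothetical, and it handles plain RSS 2.0 `<item><link>` entries only (Atom feeds use namespaces and would need extra handling).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Illustrative blocklist based on the big sites named in the post.
BLOCKED_DOMAINS = {"wikipedia.org", "stackexchange.com"}


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def links_from_rss(feed_xml: str) -> list[str]:
    """Return the <link> of every RSS <item>, skipping blocked domains."""
    root = ET.fromstring(feed_xml)
    links = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        if link and not is_blocked(link):
            links.append(link)
    return links
```

The returned links would then go into the crawl queue, so sites that publish feeds get discovered without the crawler ever touching the blocked domains.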