You might have guessed it already: We are struggling...


You might have guessed it already: We are struggling with excessive crawling today. We have - again - blocked several large IP ranges, but were not yet able to identify the new actor.

We are working on restoring service availability and fine-tuning our rate-limiting.

If someone is interested in implementing improved native rate-limiting in #Forgejo that also protects other instances from abusive crawlers, please reach out 😉

14 October at 16:44 | Open on social.anoxinon.de

14 comments

Rev. Roger BW 😷

@Codeberg We block all Azure now. Users whine but if they want good service they shouldn't be using an address in the bad part of Internet Town.

14 October at 17:14 | Open on discordian.social

Felipe M.

@Codeberg Just wondering what you folks use in front of forgejo. I experience abusive crawling as well, but my instance is a small personal one on my homelab, so it's really annoying to be losing bandwidth to abusive actors. Considering a self-hostable WAF in front of my homelab services.

14 October at 17:17 | Open on fosstodon.org

Codeberg.org

@fmartingr We're using haproxy and have a custom blacklist loaded here: https://codeberg.org/Codeberg-Infrastructure/scripted-configuration/src/commit/bef038ca91cb928e0b865ada4bc6d579b2bc857e/hosts/kampenwand/etc/haproxy/haproxy.cfg#L265

It's not public (yet), but we should probably consider opening it. We would need to check that there are only publicly known IP addresses on there, though. I'm not fully up to date on how the law treats publishing IP ranges of bad actors. ~f

14 October at 17:22 | Open on social.anoxinon.de

Felipe M.

@Codeberg I was considering trying something like CrowdSec, but I'm unsure how they handle the bad IP ranges and what they consider "bad actors". If we could have something like that but with lists like adblockers use, maintained by the community, it could be nice. Will take a look :blobfoxeyes: Thanks and hope you resolve the issue soon!

14 October at 17:27 | Open on fosstodon.org

Codeberg.org

Does "Aceville" ring a bell for anyone by chance? Related to Tencent probably? We are blocking one IP range after another ...

14 October at 17:20 | Open on social.anoxinon.de

Daniel Böhmer

@Codeberg I won’t be able to provide an implementation but for better understanding: How does rate limiting work now and what kind of improvement would be helpful in your current scenario?

14 October at 17:30 | Open on ieji.de

Codeberg.org

@dboehmer One of the primary constraints of the current rate-limiting is that there is only a global counter that increases for each request.

So a user watching Forgejo Actions logs scroll through will fire a lot of small requests. And a botnet that is distributed over many, many IP addresses does not trigger the rate-limiting at all, because each server only fires a few requests.

14 October at 19:26 | Open on social.anoxinon.de
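
One reading of the limitation described above is that every request bumps a single counter per client address, regardless of its cost, so a crawler spread across many addresses stays under the limit. The Python sketch below is not Forgejo's implementation; `FixedWindowLimiter`, `per_address`, `per_network`, the /24 prefix and the limits are all assumptions chosen for illustration. It shows how keying the counter by network range instead of by single address changes the outcome for a distributed crawler.

```python
import ipaddress
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical fixed-window rate limiter, keyed by a pluggable function."""

    def __init__(self, limit, window_seconds, key_func):
        self.limit = limit            # allowed cost per key per window
        self.window = window_seconds  # window length in seconds
        self.key_func = key_func      # maps a client address to a counter key
        self.counts = defaultdict(int)  # old windows are never evicted in this sketch

    def allow(self, client_ip, now, cost=1):
        # One counter per (key, window index); cost lets expensive
        # endpoints weigh more than cheap polling requests.
        bucket = (self.key_func(client_ip), int(now // self.window))
        self.counts[bucket] += cost
        return self.counts[bucket] <= self.limit


def per_address(ip):
    """Key by the exact client address."""
    return ip


def per_network(ip, prefix=24):
    """Key by the enclosing network (assumes IPv4 and a /24 here)."""
    return str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
```

With a limit of 10 per minute, 50 addresses in one /24 sending 4 requests each pass the per-address limiter untouched, while the per-network limiter stops admitting them after 10 requests. The optional `cost` parameter hints at how cheap polling, such as scrolling logs, could be weighted less than expensive requests.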

Harald

@Codeberg @dboehmer Is there an issue for this, to allow a more focused discussion?

15 October at 10:19 | Open on nrw.social

Marcus Rohrmoser 🌻

@Codeberg as an emergency measure I'd probably block non-authenticated HTTP (except signup + login) altogether.

14 October at 18:22 | Open on digitalcourage.social

Simon

@Codeberg I know how you feel... Same on gitnet.fr.

if ($http_user_agent ~* "facebookexternalhit|bytespider|Amazonbot|ClaudeBot|AhrefsBot") { return 429; }

15 October at 18:38 | Open on mamot.fr
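
The nginx rule above does a case-insensitive regular-expression match (`~*`) on the User-Agent header and answers 429 on a hit. As a rough way to check which agents such a pattern catches, the same match can be sketched in Python; `BLOCKED_AGENTS` and `status_for` are hypothetical names, and the 200 fallback is only a stand-in for "let the request through".

```python
import re

# Same alternation as the nginx rule, matched case-insensitively like `~*`
BLOCKED_AGENTS = re.compile(
    r"facebookexternalhit|bytespider|Amazonbot|ClaudeBot|AhrefsBot",
    re.IGNORECASE,
)


def status_for(user_agent):
    """Return 429 if the user agent matches a blocked crawler, else 200."""
    return 429 if BLOCKED_AGENTS.search(user_agent or "") else 200
```

Since the User-Agent header is self-reported, this only catches crawlers that identify themselves.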

Rachel Rawlings

@Codeberg Any chance you put some honeypot paths in robots.txt that trigger fail2ban against any requestors?

15 October at 19:49 | Open on mastodon.social
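
The honeypot idea can be sketched as follows: a path is listed under `Disallow` in robots.txt and linked nowhere, so any client requesting it has ignored the file. This Python sketch is not fail2ban; it only illustrates the detection step on access-log lines. The trap path, the `offenders` helper and the assumed common/combined log format are all hypothetical.

```python
import re

# Hypothetical trap path: disallowed in robots.txt and never linked anywhere
HONEYPOT_PREFIXES = ("/trap-do-not-crawl/",)

# Start of a common/combined log format line: client address, then request path
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)')


def offenders(log_lines):
    """Return the set of client addresses that requested a honeypot path."""
    hits = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(HONEYPOT_PREFIXES):
            hits.add(match.group(1))
    return hits
```

The resulting addresses would then be handed to whatever does the actual banning, which is the part fail2ban would take over in the suggestion above.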

Torsten Grote

@Codeberg
Note that @fdroidorg can't build apps hosted on codeberg anymore due to this. Its buildserver clones the repo for each app and soon gets 429.

19 October at 14:32 | Open on chaos.social

Codeberg.org

@grote
This is interesting feedback. There have been no changes to the rate-limiting, and the last two changes over the past three months were always increases.

We have blocked several offending IP ranges. Is there information about which hosting providers Fdroid uses?
@fdroidorg

19 October at 17:53 | Open on social.anoxinon.de

Hans-Christoph Steiner

@Codeberg @grote @fdroidorg where our production buildservers are located is not public information, and it is not necessarily static. But I imagine it would be easy to figure out which IP address by looking at the logs on the codeberg side. We haven't been blocked before by any other git/scm hoster, to my knowledge.

23 October at 7:03 | Open on social.librem.one