@dansup Bayesian anti-spam is an amazing tool. (Warning, I do this for a living.)

Paul Graham was the first to successfully implement it. Others had tried it earlier but didn't include metadata (email headers). His follow-up work, Better Bayesian Filtering, covers optimizing tokenization and is definitely worth a look even 20y later, especially since there's less metadata in the Fediverse than in email. (CRM114, however, is sadly not worth a look imho.)
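
To make the tokenization point concrete, here's a rough Python sketch of field-tagged tokens for a Fediverse post. The `field*token` marking is modeled on the `Subject*foo` style described in Better Bayesian Filtering. Everything else is my own placeholder: the field names (`content`, `display_name`, `bio`), the helper names, and the exact character set are assumptions, not a real API or Graham's exact rules.

```python
import re

# Token characters loosely follow the essay's description: letters, digits,
# apostrophes, dollar signs, dashes and exclamation marks. Case is preserved.
TOKEN_RE = re.compile(r"[A-Za-z0-9'$!-]+")


def tokenize(text, field=None):
    """Split text into tokens, tagging each with the field it came from."""
    tokens = TOKEN_RE.findall(text)
    if field is None:
        return tokens
    # "Free" in a bio becomes "bio*Free", so it is counted separately
    # from "Free" in the post body.
    return [f"{field}*{tok}" for tok in tokens]


def post_tokens(post):
    """Collect tokens from a post dict; the keys are hypothetical field names."""
    tokens = tokenize(post.get("content", ""))
    for field in ("display_name", "bio"):
        tokens += tokenize(post.get(field, ""), field=field)
    return tokens
```

The idea is that the little metadata a post does carry gets its own token namespace, so a word that's harmless in the body can still count as a strong signal when it shows up in a display name or bio.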