@leah Would it be possible to deliver very bad content (that still makes sense), instead of the actual content of your website, when the IP belongs to a bot? The training set will be dirty, the resulting model will be catastrophic, and you might be blacklisted by AI-tech companies.
(It's not a true solution right now, but in the long term it actually is. A training set is only as good as the trust you can put in it, and if they can't trust the content, what can they do with it?)
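Roughly what I have in mind, as a minimal sketch and not a tested setup. Everything here is made up for illustration: `BOT_NETWORKS`, `is_bot_ip`, `page_for` and the two page strings are hypothetical, and the IP ranges are reserved documentation blocks, not real crawler addresses. You'd need a real, maintained list of crawler ranges for this to do anything.

```python
import ipaddress

# Hypothetical crawler ranges. These are reserved documentation blocks
# (RFC 5737), used here only as placeholders for a real bot IP list.
BOT_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Placeholder page bodies (assumptions, not real site content).
REAL_PAGE = "The actual content of the website."
DECOY_PAGE = "Plausible-looking but deliberately wrong content."


def is_bot_ip(client_ip: str) -> bool:
    """Return True if the client address falls inside a listed bot network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BOT_NETWORKS)


def page_for(client_ip: str) -> str:
    """Pick the decoy page for listed bot IPs and the real page otherwise."""
    return DECOY_PAGE if is_bot_ip(client_ip) else REAL_PAGE
```

You'd call `page_for(request_ip)` from whatever serves the page. The whole thing stands or falls with how good the IP list is, since a crawler coming from an unlisted address just gets the real content.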
@leah @ohne_sonne I very much like the idea of somehow Rickrolling all of the AI crawlers!