Rob Hafner :verified_flashing:

If you're an admin of a Mastodon/Fediverse instance, you should update your robots.txt to block "GPTBot", the crawler OpenAI uses to collect training data for machine learning models such as the ones behind ChatGPT.

Right now this is the easiest way to keep public content from being crawled and fed into their datasets. Because federation copies public posts to other instances, each of which serves its own copy, it works better the more instances do it.
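For admins who want to sanity-check the change, here is a minimal sketch of what a GPTBot-specific block could look like and how a standards-following parser reads it. It uses Python's standard `urllib.robotparser`; the `is_allowed` helper, the `example.social` host, and the `SomeOtherBot` agent name are made up for illustration. It shows how the rules should be interpreted, not how GPTBot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Rules that target only GPTBot and leave every other crawler alone.
GPTBOT_RULES = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.social is a placeholder host, not a real instance.
    url = "https://example.social/@someone/123"
    print("GPTBot allowed:", is_allowed(GPTBOT_RULES, "GPTBot", url))
    print("SomeOtherBot allowed:", is_allowed(GPTBOT_RULES, "SomeOtherBot", url))
```

With these rules only the crawler identifying as GPTBot is turned away, so search engines and other agents keep whatever access they had before.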

platform.openai.com/docs/gptbo

4 comments
i am root

@tedivm Do you happen to know if a blanket disallow like:

User-agent: *
Disallow: /

Will be honored by GPTBot? I wouldn't put it past them to ignore the root disallow and require a specific `User-agent: GPTBot`, but it's not called out in the doc either way.
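As a rough reference point for the question above, this sketch shows how Python's standard `urllib.robotparser` treats a blanket disallow: the `*` group applies to any agent that has no group of its own. The `blocked_by` helper and the `example.social` URL are hypothetical, and whether GPTBot itself follows the same logic is exactly what is unconfirmed here.

```python
from urllib.robotparser import RobotFileParser

# The blanket rule from the question: every agent, every path.
BLANKET_RULES = """\
User-agent: *
Disallow: /
"""


def blocked_by(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a standards-following parser would refuse url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL; GPTBot has no group of its own in BLANKET_RULES,
    # so the wildcard group is the one that applies to it.
    url = "https://example.social/@someone"
    print("GPTBot blocked:", blocked_by(BLANKET_RULES, "GPTBot", url))
```

Under this reading a separate `User-agent: GPTBot` group is redundant when the wildcard group already disallows everything, though adding one does no harm.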

There's some inconclusive evidence in my Nginx access logs that they crawled my robots.txt, then proceeded to crawl other URLs. But I only see 6 hits total, and they all happened within 10 seconds. 🤷‍♂️
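A small script can make that kind of log check repeatable. The sketch below assumes Nginx's default "combined" log format and matches any user agent containing "GPTBot"; the regular expression and both function names are illustrative, not taken from the thread. It lists what the crawler requested after its first fetch of `/robots.txt`, which is suggestive at best: crawlers may cache robots.txt, and a later request only matters if its path is actually disallowed.

```python
import re

# Matches the request, status, size, referer and user-agent fields of an
# Nginx "combined" log line (assumed format; adjust for custom log_format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def gptbot_paths(lines):
    """Return the paths requested by GPTBot, in log order."""
    paths = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "gptbot" in match.group("agent").lower():
            paths.append(match.group("path"))
    return paths


def paths_after_robots(paths):
    """Return paths requested after the first /robots.txt fetch."""
    if "/robots.txt" not in paths:
        return []
    first = paths.index("/robots.txt")
    return [p for p in paths[first + 1:] if p != "/robots.txt"]
```

To try it, pass an open access log (for example `open("/var/log/nginx/access.log")`, if that is where yours lives) to `gptbot_paths` and feed the result to `paths_after_robots`.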

Rob Hafner :verified_flashing:

@null It should be. I'd be interested if anyone has examples of it ignoring those.
