diff --git a/README.md b/README.md
index b155f90..3d45bc0 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,26 @@ This is an open list of web crawlers associated with AI companies and the traini
 
 A number of these crawlers have been sourced from [Dark Visitors](https://darkvisitors.com) and we appreciate the ongoing effort they put in to track these crawlers.
 
+---
+
+## Additional resources
+
+**Spawning.ai**
+[Create an ai.txt](https://spawning.ai/ai-txt#create): an additional avenue to block crawlers. Example file:
+
+```text
+# Spawning AI
+# Prevent datasets from using the following file types
+
+User-Agent: *
+Disallow: /
+Disallow: *
+```
+
+**[Have I Been Trained?](https://haveibeentrained.com/)**
+Search datasets for your content and request its removal.
+
+
 ---
 
 Thank you to [Glyn](https://github.com/glyn) for pushing [me](https://coryd.dev) to set this up after [I posted about blocking these crawlers](https://coryd.dev/posts/2024/go-ahead-and-block-ai-web-crawlers/).