
ai.robots.txt

This is an open list of web crawlers associated with AI companies and the training of LLMs, intended for blocking. We encourage you to contribute to this list and to implement it on your own site. See table-of-bot-metrics.md for information about the listed crawlers, and FAQ.md for frequently asked questions.

A number of these crawlers have been sourced from Dark Visitors, and we appreciate the ongoing effort they put into tracking them.

If you'd like to add a crawler to the list, please make a pull request with the bot name and any relevant details added to robots.json (see Contributing below) to help people understand what's crawling.

Usage

This repository provides the following files:

  • robots.txt
  • .htaccess
  • nginx-block-ai-bots.conf

robots.txt implements the Robots Exclusion Protocol (RFC 9309).
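For illustration, entries in the file take the standard Robots Exclusion Protocol shape; a site refusing two of the listed crawlers would serve something like the following at /robots.txt (GPTBot and CCBot are shown as examples; the repository's file covers the full list):

```
User-agent: GPTBot
User-agent: CCBot
Disallow: /
```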

.htaccess may be used to configure web servers such as Apache httpd to return an error page when one of the listed AI crawlers sends a request to the web server. Note that, as stated in the httpd documentation, more performant methods than an .htaccess file exist.
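Conceptually, the blocking works by matching the request's User-Agent header with mod_rewrite and refusing the request. The following is a simplified sketch of that approach, not the generated file itself (the bot names shown are examples):

```apache
RewriteEngine On
# Refuse requests whose User-Agent matches a listed AI crawler (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} "GPTBot|CCBot" [NC]
RewriteRule .* - [F,L]
```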

nginx-block-ai-bots.conf provides an Nginx configuration snippet that can be included in any virtual host's server {} block via the include directive.
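For instance, assuming the snippet has been copied into Nginx's configuration directory, a virtual host could pull it in like this (paths and names are illustrative):

```nginx
server {
    listen 80;
    server_name example.com;

    # Reject requests from the listed AI crawler user agents
    include /etc/nginx/snippets/nginx-block-ai-bots.conf;

    location / {
        root /var/www/example.com;
    }
}
```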

Contributing

A note about contributing: updates should be made to robots.json. A GitHub Action will then regenerate robots.txt, table-of-bot-metrics.md, .htaccess, and nginx-block-ai-bots.conf from it.
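A new entry is a keyed JSON object describing the bot. The sketch below shows the general shape only; the bot name and values are hypothetical, so mirror the field names used by the existing entries in robots.json rather than this example:

```json
{
  "ExampleBot": {
    "operator": "Example Corp",
    "respect": "Unclear",
    "function": "Scrapes data to train LLMs",
    "frequency": "Unclear",
    "description": "Hypothetical entry for illustration; copy the structure of existing entries."
  }
}
```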

You can run the tests by installing Python 3 and issuing:

code/tests.py

Subscribe to updates

You can subscribe to list updates via RSS/Atom with the releases feed:

https://github.com/ai-robots-txt/ai.robots.txt/releases.atom

You can subscribe with Feedly, Inoreader, The Old Reader, Feedbin, or any other reader app.

Alternatively, you can subscribe to new releases with your GitHub account by clicking the ⬇️ next to the "Watch" button at the top of the repository page, choosing "Custom", and selecting "Releases".

Report abusive crawlers

If you use Cloudflare's hard block alongside this list, you can report abusive crawlers that don't respect robots.txt through Cloudflare's abuse-report form. Even if you don't use Cloudflare's hard block, their list of verified bots may come in handy.

Additional resources