A list of AI agents and robots to block.

AI robots.txt

This is an open list of web crawlers, associated with AI companies and the training of LLMs, that you may want to block. We encourage you to contribute to the list and to implement it on your own site.
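
As a rough illustration of what implementing the list can look like, the sketch below builds a robots.txt group that disallows every path for a set of crawler names, then checks the result with Python's standard `urllib.robotparser`. The two names (`GPTBot`, `img2dataset`) are examples only, and the helper names `build_robots_txt` and `is_blocked` are hypothetical, not part of this repository; the maintained list lives in robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Example names only; the repository's robots.txt holds the full list.
BLOCKED_AGENTS = ["GPTBot", "img2dataset"]


def build_robots_txt(agents):
    """Return robots.txt text that disallows every path for the given agents."""
    # Several User-agent lines may share one group of rules.
    lines = [f"User-agent: {name}" for name in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_blocked(robots_txt, agent, url="https://example.com/"):
    """Check whether a crawler that honors robots.txt would be denied the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = build_robots_txt(BLOCKED_AGENTS)
    print(text)
    for agent in BLOCKED_AGENTS + ["Mozilla/5.0"]:
        print(agent, "blocked" if is_blocked(text, agent) else "allowed")
```

Keep in mind that robots.txt is advisory: it only affects crawlers that choose to honor it.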

A number of these crawlers have been sourced from Dark Visitors, and we appreciate their ongoing effort to track them.

If you'd like to add information about a crawler to the list, please open a pull request that adds the bot name to robots.txt and ai.txt and puts any relevant details in table-of-bot-metrics.md, to help people understand what's crawling.


Thank you to Glyn for pushing me to set this up after I posted about blocking these crawlers.