mirror of
https://github.com/ai-robots-txt/ai.robots.txt.git
synced 2025-04-04 19:13:57 +00:00
chore: additional resources
commit d7064d23fe (parent 5e02ebc168)
1 changed file with 20 additions and 0 deletions
README.md
@@ -6,6 +6,26 @@ This is an open list of web crawlers associated with AI companies and the training
A number of these crawlers have been sourced from [Dark Visitors](https://darkvisitors.com) and we appreciate the ongoing effort they put in to track these crawlers.
---
## Additional resources
**Spawning.ai**
[Create an ai.txt](https://spawning.ai/ai-txt#create): an additional avenue to block crawlers. Example file:
```text
# Spawning AI
# Prevent datasets from using the following file types
User-Agent: *
Disallow: /
Disallow: *
```
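As a rough illustration of how rules like the ones above would be interpreted, here is a minimal parser sketch. It assumes ai.txt follows robots.txt-style record syntax (which the template above suggests); `parse_ai_txt` is a hypothetical helper, not part of Spawning's tooling.

```python
def parse_ai_txt(text: str) -> dict[str, list[str]]:
    """Parse robots.txt-style records into {user-agent: [disallowed paths]}.

    Hypothetical sketch: assumes ai.txt uses User-Agent/Disallow records
    like the Spawning template above.
    """
    rules: dict[str, list[str]] = {}
    agent = None
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agent = value
            rules.setdefault(agent, [])
        elif field == "disallow" and agent is not None:
            rules[agent].append(value)
    return rules

example = """\
# Spawning AI
# Prevent datasets from using the following file types

User-Agent: *
Disallow: /
Disallow: *
"""
print(parse_ai_txt(example))  # → {'*': ['/', '*']}
```

A `Disallow` for both `/` and `*` is belt-and-braces: some consumers match path prefixes, others match glob patterns, so the template covers both.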
**[Have I Been Trained?](https://haveibeentrained.com/)**
Search datasets for your content and request its removal.
---
Thank you to [Glyn](https://github.com/glyn) for pushing [me](https://coryd.dev) to set this up after [I posted about blocking these crawlers](https://coryd.dev/posts/2024/go-ahead-and-block-ai-web-crawlers/).