mirror of
https://github.com/ai-robots-txt/ai.robots.txt.git
synced 2025-04-05 11:27:45 +00:00
fix: resolve conflict
commit
429336d725
2 changed files with 8 additions and 2 deletions
@@ -1,7 +1,13 @@
# AI robots.txt
# ai.robots.txt

<img src="/assets/images/noai-logo.png" width="100" />

**[Subscribe to updates via RSS/Atom by clicking on this link.](https://github.com/ai-robots-txt/ai.robots.txt/releases.atom)**

_(Or paste the link into your preferred feed reader.)_

---

This is an open list of web crawlers associated with AI companies and the training of LLMs to block. We encourage you to contribute to and implement this list on your own site.

A number of these crawlers have been sourced from [Dark Visitors](https://darkvisitors.com) and we appreciate the ongoing effort they put in to track these crawlers.
@@ -3,7 +3,7 @@
| AdsBot-Google | Google | Yes (Exceptions for Dynamic Search Ads) | Analyzes website content for ad relevancy, improves ad serving for Google Ads. Data anonymized according to [Google's Privacy Policy](https://policies.google.com/privacy). Unclear on data retention or use by other products. | Varies depending on campaign activity and website updates. Crawls optimized to minimize impact, specific frequency not public. | Web crawler by Google Ads to analyze websites for ad effectiveness and ensure ad relevancy to webpage content. |
|Amazonbot | Amazon | Yes | Service improvement and enabling answers for Alexa users. | No information provided. | Includes references to crawled website when surfacing answers via Alexa; does not clearly outline other uses. |
|anthropic-ai | [Anthropic](https://www.anthropic.com) | Unclear at this time. | Scrapes data to train Anthropic's AI products. | No information provided. | Scrapes data to train LLMs and AI products offered by Anthropic. |
|Applebot-Extended | | | | | |
|Applebot-Extended | [Apple](https://support.apple.com/en-us/119829#datausage) | Yes | | | Apple has a secondary user agent, Applebot-Extended ... [that is] used to train Apple’s foundation models powering generative AI features across Apple products, including Apple Intelligence, Services, and Developer Tools. |
|AwarioRssBot | | | | | |
|AwarioSmartBot | | | | | |
|Bytespider | ByteDance | No | LLM training. | Unclear at this time. | Downloads data to train LLMS, including ChatGPT competitors. |
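
As a rough illustration of what "implementing this list on your own site" could look like, the sketch below turns the user-agent names from the table rows above into a single `robots.txt` group that disallows all of them. The function name `build_robots_txt` and the `agents` list are hypothetical and are not part of the repository; the project's own generated `robots.txt` may be structured differently.

```python
def build_robots_txt(user_agents):
    """Return robots.txt text with one group that blocks every listed agent.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    # One "User-agent" line per crawler, all sharing the rule that follows.
    lines = [f"User-agent: {name}" for name in user_agents]
    # "Disallow: /" asks the listed crawlers not to fetch any path on the site.
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


# User-agent names copied from the table rows shown in this diff.
agents = [
    "AdsBot-Google",
    "Amazonbot",
    "anthropic-ai",
    "Applebot-Extended",
    "AwarioRssBot",
    "AwarioSmartBot",
    "Bytespider",
]

robots_txt = build_robots_txt(agents)
print(robots_txt)
```

Note that `robots.txt` is advisory: it only affects crawlers that choose to honor it, and the table's third column suggests at least one listed crawler (Bytespider) does not.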