ai.robots.txt/table-of-bot-metrics.md
2024-04-10 10:39:34 -07:00

| Name | Operator | Respects robots.txt | Data use | Visit regularity | Description |
|------|----------|---------------------|----------|------------------|-------------|
| AdsBot-Google | Google | Yes (Exceptions for Dynamic Search Ads) | Analyzes website content for ad relevancy, improves ad serving for Google Ads. Data anonymized according to Google's Privacy Policy (https://policies.google.com/privacy?hl=en-US). Unclear on data retention or use by other products. | Varies depending on campaign activity and website updates. Crawls optimized to minimize impact, specific frequency not public. | Web crawler by Google Ads to analyze websites for ad effectiveness and ensure ad relevancy to webpage content. |
| Amazonbot | Amazon | Yes | Service improvement and enabling answers for Alexa users. | No information provided. | Includes references to crawled website when surfacing answers via Alexa; does not clearly outline other uses. |
| anthropic-ai | Anthropic | Unclear at this time | Obtains training data for Anthropic's AI products. | No information provided. | Scrapes data to train LLMs and AI products offered by Anthropic. |
| Applebot | Apple | Yes | Indexes sites to provide answers and search results for Siri users. | Irregular and may be prompted by user queries. | Used to answer queries from users; may include references to the indexed site. |
| AwarioRssBot | | | | | |
| AwarioSmartBot | | | | | |
| Bytespider | | | | | |
| CCBot | | | | | |
| ChatGPT-User | OpenAI | Yes | Takes action based on user prompts. | Only when prompted by a user. | Used by plugins in ChatGPT to answer queries based on user input. |
| ClaudeBot | | | | | |
| Claude-Web | | | | | |
| coher-ai | | | | | |
| DataForSeoBot | | | | | |
| FacebookBot | | | | | |
| Google-Extended | | | | | |
| GoogleOther | | | | | |
| GPTBot | | | | | |
| ImagesiftBot | | | | | |
| magpie-crawler | | | | | |
| Meltwater | | | | | |
| omgili | | | | | |
| omgilibot | | | | | |
| peer39_crawler | | | | | |
| peer39_crawler/1.0 | | | | | |
| PerplexityBot | | | | | |
| PiplBot | | | | | |
| Seekr | | | | | |
| YouBot | | | | | |
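
The names in the first column are the user-agent tokens a site would list in its `robots.txt`. As a minimal sketch (not part of the original table), the Python snippet below builds a single `robots.txt` group that disallows a few of the listed agents and then checks the result with the standard library's `urllib.robotparser`. The helper name `build_robots_txt`, the three-agent sample, and the `example.com` URL are assumptions chosen for illustration. A rule like this only affects crawlers that honor `robots.txt`, which the "Respects robots.txt" column shows is not confirmed for every bot.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agents, disallow_path="/"):
    """Hypothetical helper: return robots.txt text with one group
    that applies the same Disallow rule to every listed agent."""
    lines = [f"User-agent: {name}" for name in user_agents]
    lines.append(f"Disallow: {disallow_path}")
    return "\n".join(lines) + "\n"


# A few names copied from the table; extend the list as needed.
agents = ["GPTBot", "CCBot", "ClaudeBot"]
robots_txt = build_robots_txt(agents)
print(robots_txt)

# Parse the generated text and confirm how a compliant crawler would read it.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Placeholder URL, used only for the check.
page = "https://example.com/page"
gptbot_allowed = parser.can_fetch("GPTBot", page)
other_allowed = parser.can_fetch("SomeOtherAgent", page)
print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherAgent allowed:", other_allowed)
```

Grouping several `User-agent` lines above one `Disallow` rule keeps the file short; agents not named in the group are unaffected.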