chore: add bytespider to table

2025-05-22 09:53:10 +00:00 · 2024-05-06 10:34:43 -07:00 · ec8e10a56c
commit ec8e10a56c
parent 262fe71ad9
1 changed files with 3 additions and 3 deletions
--- a/table-of-bot-metrics.md
+++ b/table-of-bot-metrics.md
@@ -2,12 +2,12 @@
 |----------------|---------|-----------------------|----------|------------------|-------------|
 | AdsBot-Google   | Google  | Yes (Exceptions for Dynamic Search Ads) | Analyzes website content for ad relevancy, improves ad serving for Google Ads. Data anonymized according to Google's Privacy Policy (https://policies.google.com/privacy?hl=en-US). Unclear on data retention or use by other products. | Varies depending on campaign activity and website updates. Crawls optimized to minimize impact, specific frequency not public. | Web crawler by Google Ads to analyze websites for ad effectiveness and ensure ad relevancy to webpage content. |
 |Amazonbot      | Amazon | Yes | Service improvement and enabling answers for Alexa users. | No information provided. | Includes references to crawled website when surfacing answers via Alexa; does not clearly outline other uses. |
-|anthropic-ai  | [Anthropic](https://www.anthropic.com) | Unclear at this time | Scrapes data to train Anthropic's AI products. | No information provided. | Scrapes data to train LLMs and AI products offered by Anthropic. |
+|anthropic-ai  | [Anthropic](https://www.anthropic.com) | Unclear at this time. | Scrapes data to train Anthropic's AI products. | No information provided. | Scrapes data to train LLMs and AI products offered by Anthropic. |
 |Applebot      | Apple         | Yes | Indexes sites to provide answers and search results for Siri users. | Irregular and may be prompted by user queries. | Used to answer queries from users; may include references to the indexed site. |
 |AwarioRssBot   |         |                       |          |                  |             |
 |AwarioSmartBot |         |                       |          |                  |             |
-|Bytespider    |         |                       |          |                  |             |
-|CCBot         | [Common Crawl](https://commoncrawl.org) | [Yes](https://commoncrawl.org/ccbot) | Provides crawl data for an open source repository that has been used to train LLMs. | Unclear at this tiem. | Sources data that is made openly available and is used to train AI models. |
+|Bytespider    | ByteDance | No | LLM training. | Unclear at this time. | Downloads data to train LLMs, including ChatGPT competitors. |
+|CCBot         | [Common Crawl](https://commoncrawl.org) | [Yes](https://commoncrawl.org/ccbot) | Provides crawl data for an open source repository that has been used to train LLMs. | Unclear at this time. | Sources data that is made openly available and is used to train AI models. |
 |ChatGPT-User   | [OpenAI](https://openai.com) | Yes | Takes action based on user prompts. | Only when prompted by a user. | Used by plugins in ChatGPT to answer queries based on user input. |
 |ClaudeBot      |         |                       |          |                  |             |
 |Claude-Web    |         |                       |          |                  |             |