From 261a2b83b90fe89f1d842066709c019fd1dba30f Mon Sep 17 00:00:00 2001
From: always-be-testing
Date: Fri, 14 Feb 2025 12:26:19 -0500
Subject: [PATCH 1/5] update README to inclide list of ai bots Cloudflare considers verified

---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index 065b0b7..6758570 100644
--- a/README.md
+++ b/README.md
@@ -40,6 +40,19 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
 
 If you use [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) alongside this list, you can report abusive crawlers that don't respect `robots.txt` [here](https://docs.google.com/forms/d/e/1FAIpQLScbUZ2vlNSdcsb8LyTeSF7uLzQI96s0BKGoJ6wQ6ocUFNOKEg/viewform).
 
+
+If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that make use of [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots) conditions, please note that the following AI web crawlers are considered verified bots by Cloudflare:
+- Amazonbot
+- Applebot
+- CCBot
+- ChatGPT-User
+- DuckAssistBot
+- GoogleOther
+- GPTBot
+- OAI-SearchBot
+- PerplexityBot
+- PetalBot
+
 ## Additional resources
 
 - [Blocking Bots with Nginx](https://rknight.me/blog/blocking-bots-with-nginx/) by Robb Knight

From e396a2ec781095c5e2659eefb99c46ab7715a664 Mon Sep 17 00:00:00 2001
From: always-be-testing
Date: Fri, 14 Feb 2025 12:31:20 -0500
Subject: [PATCH 2/5] forgot to include heading

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 6758570..e70d283 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
 
 If you use [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) alongside this list, you can report abusive crawlers that don't respect `robots.txt` [here](https://docs.google.com/forms/d/e/1FAIpQLScbUZ2vlNSdcsb8LyTeSF7uLzQI96s0BKGoJ6wQ6ocUFNOKEg/viewform).
 
-
+## Cloudflare Verified Bots
 If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that make use of [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots) conditions, please note that the following AI web crawlers are considered verified bots by Cloudflare:
 - Amazonbot
 - Applebot

From f99339922fa9afdbb00e18bb99105e81cd3f8e88 Mon Sep 17 00:00:00 2001
From: always-be-testing
Date: Fri, 14 Feb 2025 12:36:33 -0500
Subject: [PATCH 3/5] grammar update and include syntax for verified bot condition

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index e70d283..f471ede 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
 If you use [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) alongside this list, you can report abusive crawlers that don't respect `robots.txt` [here](https://docs.google.com/forms/d/e/1FAIpQLScbUZ2vlNSdcsb8LyTeSF7uLzQI96s0BKGoJ6wQ6ocUFNOKEg/viewform).
 
 ## Cloudflare Verified Bots
-If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that make use of [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots) conditions, please note that the following AI web crawlers are considered verified bots by Cloudflare:
+If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that use the `cf.bot_management.verified_bot` condition based on [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots), please note that the following AI web crawlers are considered verified bots by Cloudflare:
 - Amazonbot
 - Applebot
 - CCBot

From af87b85d7f00bc285cb414280e02d2f42284a9d8 Mon Sep 17 00:00:00 2001
From: always-be-testing
Date: Fri, 14 Feb 2025 12:39:08 -0500
Subject: [PATCH 4/5] include return after heading

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index f471ede..303f009 100644
--- a/README.md
+++ b/README.md
@@ -41,6 +41,7 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
 If you use [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) alongside this list, you can report abusive crawlers that don't respect `robots.txt` [here](https://docs.google.com/forms/d/e/1FAIpQLScbUZ2vlNSdcsb8LyTeSF7uLzQI96s0BKGoJ6wQ6ocUFNOKEg/viewform).
 
 ## Cloudflare Verified Bots
+
 If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that use the `cf.bot_management.verified_bot` condition based on [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots), please note that the following AI web crawlers are considered verified bots by Cloudflare:
 - Amazonbot
 - Applebot

From 5b13c2e504c843c2a95981cee1c2655d9f21c8f4 Mon Sep 17 00:00:00 2001
From: always-be-testing
Date: Sat, 15 Feb 2025 11:22:10 -0500
Subject: [PATCH 5/5] add more concise message about verified bots

Co-authored-by: Glyn Normington
---
 README.md | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/README.md b/README.md
index 303f009..a206c83 100644
--- a/README.md
+++ b/README.md
@@ -39,21 +39,7 @@ Alternatively, you can also subscribe to new releases with your GitHub account b
 ## Report abusive crawlers
 
 If you use [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) alongside this list, you can report abusive crawlers that don't respect `robots.txt` [here](https://docs.google.com/forms/d/e/1FAIpQLScbUZ2vlNSdcsb8LyTeSF7uLzQI96s0BKGoJ6wQ6ocUFNOKEg/viewform).
-
-## Cloudflare Verified Bots
-
-If you are unable to make use of [Cloudflare's hard block](https://blog.cloudflare.com/declaring-your-aindependence-block-ai-bots-scrapers-and-crawlers-with-a-single-click) and/or have WAF rules that use the `cf.bot_management.verified_bot` condition based on [Cloudflare's Verified Bots](https://radar.cloudflare.com/traffic/verified-bots), please note that the following AI web crawlers are considered verified bots by Cloudflare:
-- Amazonbot
-- Applebot
-- CCBot
-- ChatGPT-User
-- DuckAssistBot
-- GoogleOther
-- GPTBot
-- OAI-SearchBot
-- PerplexityBot
-- PetalBot
-
+But even if you don't use Cloudflare's hard block, their list of [verified bots](https://radar.cloudflare.com/traffic/verified-bots) may come in handy.
 ## Additional resources
 
 - [Blocking Bots with Nginx](https://rknight.me/blog/blocking-bots-with-nginx/) by Robb Knight