Compare commits

...

2 commits

Author SHA1 Message Date
Glyn Normington
9a9b1b41c0
Merge pull request #119 from ai-robots-txt/bing-ai-opt-out-instructions
Bing AI opt-out instructions
2025-05-14 19:18:20 +01:00
36a52a88d8
Bing AI opt-out instructions 2025-05-12 20:20:18 -07:00
2 changed files with 38 additions and 0 deletions


@ -35,6 +35,8 @@ Note that, as stated in the [httpd documentation](https://httpd.apache.org/docs/
``` ```
(Note that the path of the `haproxy-block-ai-bots.txt` may be different in your environment.)
[Bing uses the data it crawls for AI and training; you may opt out by adding a `meta` tag to the `head` of your site.](./docs/additional-steps/bing.md)
## Contributing
A note about contributing: updates should be added/made to `robots.json`. A GitHub action will then generate the updated `robots.txt`, `table-of-bot-metrics.md`, `.htaccess` and `nginx-block-ai-bots.conf`.


@ -0,0 +1,36 @@
# Bing (bingbot)
It's not well publicised, but Bing uses the data it crawls for AI and training.
However, the current thinking is that blocking a search engine of this size with `robots.txt` is quite a drastic approach: Bing is second only to Google, and blocking it could significantly impact your website in search results.
Additionally, Bing powers a number of search engines such as Yahoo and AOL, and its search results are also used by DuckDuckGo, amongst others.
Fortunately, Bing supports a relatively simple opt-out method that requires one additional step.
## How to opt out of AI training
Add a `meta` tag to the `<head>` of your webpage. The tag must be added to every page on your website.
The line you need to add is:
```plaintext
<meta name="robots" content="noarchive">
```
By adding this line, you are signifying to Bing: "Do not use the content for training Microsoft's generative AI foundation models."
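Because the tag has to be present on every page, it can help to check pages automatically. The following is a minimal sketch, not part of this repository, that uses Python's standard `html.parser` module to test whether a page declares `noarchive` in a `robots` meta tag. The names `RobotsMetaParser` and `has_noarchive` are hypothetical, and the sketch does not verify that the tag sits inside `<head>`.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noarchive, nofollow".
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noarchive(html_text):
    """Return True if the page declares noarchive in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noarchive" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(has_noarchive(page))
```

To check a whole site, you could call `has_noarchive` on the HTML of each page, for example on the files produced by a static site build.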
## Will my site be negatively affected?
The short answer is no.
The original use of "noarchive" has been retired by all search engines. Google retired its use in 2024.
Adding this meta tag to your page(s) will not affect how your site appears in search engines, or affect it in any other meaningful way.
It is now used solely by a handful of crawlers, such as Bingbot and Amazonbot, as a signal not to use your data for AI training.
## Resources
Bing Blog AI opt-out announcement: https://blogs.bing.com/webmaster/september-2023/Announcing-new-options-for-webmasters-to-control-usage-of-their-content-in-Bing-Chat
Bing metatag information, including AI opt-out: https://www.bing.com/webmasters/help/which-robots-metatags-does-bing-support-5198d240