Merge branch 'main' into main

Crazyroostereye 2025-04-30 22:13:51 +02:00 committed by GitHub
commit 6baa7725b3
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
10 changed files with 160 additions and 7 deletions

@@ -15,6 +15,7 @@ This repository provides the following files:
- `.htaccess`
- `nginx-block-ai-bots.conf`
- `Caddyfile`
- `haproxy-block-ai-bots.txt`
`robots.txt` implements the Robots Exclusion Protocol ([RFC 9309](https://www.rfc-editor.org/rfc/rfc9309.html)).
@@ -25,6 +26,15 @@ Note that, as stated in the [httpd documentation](https://httpd.apache.org/docs/
`Caddyfile` includes a Header Regex matcher group you can copy or import into your Caddyfile. The rejection can then be handled as follows: `abort @aibots`
`haproxy-block-ai-bots.txt` may be used to configure HAProxy to block AI bots. To implement it:
1. Add the file to the HAProxy configuration directory.
2. Add the following lines to the `frontend` section:
```
acl ai_robot hdr_sub(user-agent) -i -f /etc/haproxy/haproxy-block-ai-bots.txt
http-request deny if ai_robot
```
(Note that the path to `haproxy-block-ai-bots.txt` may differ in your environment.)
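To sanity-check a pattern file before deploying it, the matching performed by `hdr_sub(user-agent) -i -f` can be approximated offline. The following is a minimal Python sketch, not HAProxy's implementation: it assumes one pattern per line, skips empty lines and lines beginning with `#`, strips leading spaces and tabs, and treats a request as blocked when any pattern is a case-insensitive substring of the User-Agent. The names `load_patterns`, `is_blocked`, `ExampleAIBot` and `SampleCrawler` are hypothetical and used only for illustration.

```python
# Rough offline approximation of `hdr_sub(user-agent) -i -f <file>`.
# Assumption: simplified pattern-file rules (comments, blank lines,
# leading whitespace); this is not HAProxy's actual parser.

# Hypothetical pattern-file contents, one pattern per line.
SAMPLE_PATTERNS = "# hypothetical entries\nExampleAIBot\n\n  SampleCrawler\n"


def load_patterns(text):
    """Return lowercased patterns, skipping comments and empty lines."""
    patterns = []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # a line beginning with '#' is treated as a comment
        pattern = line.lstrip(" \t")  # leading spaces and tabs are dropped
        if pattern:
            patterns.append(pattern.lower())  # lowercase to mimic -i
    return patterns


def is_blocked(user_agent, patterns):
    """True if any pattern occurs as a substring of the User-Agent."""
    lowered = user_agent.lower()
    return any(pattern in lowered for pattern in patterns)


if __name__ == "__main__":
    patterns = load_patterns(SAMPLE_PATTERNS)
    for agent in (
        "Mozilla/5.0 (compatible; ExampleAIBot/1.0)",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/125.0",
    ):
        print(agent, "->", "deny" if is_blocked(agent, patterns) else "allow")
```

To check a real deployment, read the contents of your own `haproxy-block-ai-bots.txt` into `load_patterns` and pass in User-Agent strings taken from your access logs.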
## Contributing
A note about contributing: updates should be added/made to `robots.json`. A GitHub action will then generate the updated `robots.txt`, `table-of-bot-metrics.md`, `.htaccess` and `nginx-block-ai-bots.conf`.