Commit graph

42 commits

Author SHA1 Message Date
Katrin Leinweber
a40cdcbe1e Merge d79ca19f38 into c249de99a3 2025-03-28 06:00:41 +01:00
dark-visitors
c249de99a3 Update from Dark Visitors 2025-03-28 00:54:28 +00:00
Katrin Leinweber
d79ca19f38 Add Lightpanda due to its AI/LLM focus
https://github.com/lightpanda-io/browser
2025-03-27 17:59:09 +01:00
deyigifts
6ecfcdfcbf Update perplexity bot
Update based on perplexity bot docs
2025-03-24 14:16:57 +08:00
dark-visitors
abfd6dfcd1 Update from Dark Visitors 2025-02-17 00:53:32 +00:00
a9ec4ffa6f chore: add Brightbot 1.0 2025-02-16 13:36:39 -08:00
dark-visitors
bebffccc0c Update from Dark Visitors 2025-02-02 00:52:50 +00:00
nisbet-hubbard
05b79b8a58 Update robots.json 2025-01-27 19:41:03 +08:00
dark-visitors
9c060dee1c Update from Dark Visitors 2025-01-21 00:49:22 +00:00
Joshua Sheard
7427d96bac Update robots.json
Co-authored-by: Glyn Normington <work@underlap.org>
2025-01-20 10:59:02 +00:00
Joshua Sheard
5aa08bc002 Add Crawlspace 2025-01-19 22:03:50 +00:00
Jordan Atwood
143f8f2285 Block SemrushBot 2025-01-06 12:34:38 -08:00
dark-visitors
2036a68c1f Update from Dark Visitors 2024-12-04 00:55:50 +00:00
dark-visitors
37065f9118 Update from Dark Visitors 2024-11-24 00:57:05 +00:00
Glyn Normington
80002f5e17 Allow facebookexternalhit
At the time of writing, this crawler does not
appear to be for the purpose of AI.

See: https://developers.facebook.com/docs/sharing/webmasters/web-crawlers/
(accessed on 19 November 2024).

Fixes https://github.com/ai-robots-txt/ai.robots.txt/issues/40
2024-11-19 03:33:45 +00:00
dark-visitors
bc0a0ad0e9 Update from Dark Visitors 2024-10-29 00:52:12 +00:00
dark-visitors
fe5f407673 Update from Dark Visitors 2024-10-27 00:54:47 +00:00
Glyn Normington
38a388097c Fix typo and trigger rerun of main job 2024-10-19 04:42:27 +01:00
dark-visitors
faf81efb12 Daily update from Dark Visitors 2024-10-19 01:17:15 +00:00
dark-visitors
b1491d2694 Daily update from Dark Visitors 2024-10-09 01:17:37 +00:00
Laker Turner
dc15afe847 Update robots.json with Claude respect link 2024-10-07 17:38:01 +01:00
9c2394f23b chore: add ISSCyberRiskCrawler 2024-09-30 16:25:20 -07:00
6a988be27f chore: add sidetrade bot 2024-09-28 13:58:00 -07:00
dark-visitors
7851cea4fd Daily update from Dark Visitors 2024-09-27 01:18:04 +00:00
Greg Lindahl
a6de89e6bd feat: make CCBot entry more accurate 2024-09-26 21:41:28 +00:00
dark-visitors
5963cbf9f7 Daily update from Dark Visitors 2024-09-08 01:19:31 +00:00
8373294404 chore: add iaskspider/2.0 2024-09-06 19:05:26 -07:00
0f8723558f chore: add ai2bot 2024-08-28 20:07:32 -07:00
dark-visitors
7bfc1647a8 Daily update from Dark Visitors 2024-08-22 01:11:43 +00:00
dark-visitors
5937434aff Daily update from Dark Visitors 2024-08-15 01:07:15 +00:00
dark-visitors
6a275366be Daily update from Dark Visitors 2024-08-07 10:50:45 +00:00
Chenghao Mou
944bee0f56 call main after update 2024-08-07 11:31:58 +01:00
dark-visitors
cebf809391 Daily update from Dark Visitors 2024-08-07 00:14:26 +00:00
Chenghao Mou
4cf82b703f restore original robots.json 2024-08-06 19:50:38 +01:00
dark-visitors
63c7e742c3 Daily update from Dark Visitors 2024-08-06 16:54:29 +00:00
dark-visitors
fdd261dad4 Daily update from Dark Visitors 2024-08-06 16:27:02 +00:00
Joshua Sheard
146fd4ffba Fix Imagesift user agent 2024-08-04 21:33:04 +01:00
1ca936ce11 chore: restore FriendlyCrawler + ImageSift 2024-08-04 12:28:48 -07:00
Mirium999
5826c18909 Add ICC-Crawler 2024-08-04 10:11:25 +09:00
b20dfec1e4 chore: drop in additional data 2024-08-01 15:33:07 -07:00
efabf3e721 chore: remove test data 2024-08-01 15:25:55 -07:00
Adam Newbold
1fdc79dacb Adding GitHub Action 2024-08-01 18:17:19 -04:00