Commit graph

125 commits

Author SHA1 Message Date
ai.robots.txt
b8e68c12f3 Daily update from Dark Visitors 2024-08-18 01:14:50 +00:00
ai.robots.txt
60ff792ba9 Removing previously generated files 2024-08-18 01:14:49 +00:00
ai.robots.txt
3afcefdff5 Daily update from Dark Visitors 2024-08-17 01:08:17 +00:00
ai.robots.txt
558d5871b2 Removing previously generated files 2024-08-17 01:08:17 +00:00
ai.robots.txt
2a075cb2f1 Daily update from Dark Visitors 2024-08-16 01:10:14 +00:00
ai.robots.txt
3ef9cb7ce4 Removing previously generated files 2024-08-16 01:10:13 +00:00
ai.robots.txt
df5b6ef647 Daily update from Dark Visitors 2024-08-14 01:11:03 +00:00
ai.robots.txt
2c8ed062b9 Removing previously generated files 2024-08-14 01:11:02 +00:00
ai.robots.txt
2e8e8af8e4 Daily update from Dark Visitors 2024-08-13 01:12:03 +00:00
ai.robots.txt
f1d0c5b1fe Removing previously generated files 2024-08-13 01:12:02 +00:00
ai.robots.txt
53a39b2f71 Daily update from Dark Visitors 2024-08-12 01:12:23 +00:00
ai.robots.txt
274d48b8f0 Removing previously generated files 2024-08-12 01:12:23 +00:00
ai.robots.txt
6472e07f09 Daily update from Dark Visitors 2024-08-11 01:16:04 +00:00
ai.robots.txt
cb98669cc2 Removing previously generated files 2024-08-11 01:16:03 +00:00
ai.robots.txt
53449ad1bd Daily update from Dark Visitors 2024-08-10 01:10:53 +00:00
ai.robots.txt
4242f8cc7b Removing previously generated files 2024-08-10 01:10:53 +00:00
ai.robots.txt
21e5cd96a9 Daily update from Dark Visitors 2024-08-09 01:11:12 +00:00
ai.robots.txt
ed7d7d3fdf Removing previously generated files 2024-08-09 01:11:11 +00:00
ai.robots.txt
57f006150b Daily update from Dark Visitors 2024-08-08 01:10:13 +00:00
ai.robots.txt
40f9325a4f Removing previously generated files 2024-08-08 01:10:12 +00:00
ai.robots.txt
0122dea1e9 Merge pull request #32 from ChenghaoMou/main
Tracking Dark Visitors Automatically
2024-08-07 22:40:24 +00:00
ai.robots.txt
663b85cc07 Removing previously generated files 2024-08-07 22:40:24 +00:00
ai.robots.txt
ab17662f96 Daily update from Dark Visitors 2024-08-07 11:41:00 +00:00
ai.robots.txt
8738c66c65 Removing previously generated files 2024-08-07 11:40:59 +00:00
Chenghao Mou
b00067bc86 restore files deleted by failed workflow and fix main commit message 2024-08-07 12:36:21 +01:00
ai.robots.txt
4a63c482c4 Removing previously generated files 2024-08-07 11:31:02 +00:00
Chenghao Mou
366e49dc6d restore files deleted by failed workflow and fix main commit message 2024-08-07 12:21:40 +01:00
ai.robots.txt
aaa55594e1 Removing previously generated files 2024-08-07 11:13:16 +00:00
Chenghao Mou
fbebbbfefb restore files deleted by failed workflow 2024-08-07 12:02:50 +01:00
ai.robots.txt
d4f34363ec Removing previously generated files 2024-08-07 10:40:50 +00:00
ai.robots.txt
30eaff1447 call main after update 2024-08-07 10:32:13 +00:00
ai.robots.txt
bd3eee7a30 Removing previously generated files 2024-08-07 10:32:12 +00:00
ai.robots.txt
3d4bf2c3db restore original robots.json 2024-08-06 18:50:54 +00:00
ai.robots.txt
d6a5e8cd81 Removing previously generated files 2024-08-06 18:50:53 +00:00
ai.robots.txt
e12ddc0f42 Merge pull request #29 from jbowdre/dev
only build on changes to robots.json
2024-08-06 15:44:54 +00:00
ai.robots.txt
b54e274bbc Removing previously generated files 2024-08-06 15:44:53 +00:00
ai.robots.txt
eb924b9856 Merge pull request #28 from jsheard/patch-2
Add Cloudflares first-party scraper blocking to FAQ
2024-08-04 21:54:17 +00:00
ai.robots.txt
1cfc071498 Removing previously generated files 2024-08-04 21:54:16 +00:00
ai.robots.txt
c2f177870f Merge pull request #27 from jsheard/patch-1
Fix Imagesift user agent
2024-08-04 21:53:48 +00:00
ai.robots.txt
0072b8f5f0 Removing previously generated files 2024-08-04 21:53:47 +00:00
ai.robots.txt
c7b781034e chore: restore FriendlyCrawler + ImageSift 2024-08-04 19:29:01 +00:00
ai.robots.txt
9a8fa66772 Removing previously generated files 2024-08-04 19:29:00 +00:00
ai.robots.txt
8de5bc8e01 Merge pull request #25 from mirium999/add_icc_crawler
Add ICC-Crawler
2024-08-04 01:21:56 +00:00
ai.robots.txt
8c632e1ba4 Removing previously generated files 2024-08-04 01:21:55 +00:00
ai.robots.txt
ffbad453f3 Merge pull request #24 from nisbet-hubbard/patch-5
Add last line of defence to FAQ
2024-08-03 14:27:47 +00:00
ai.robots.txt
b1907d86be Removing previously generated files 2024-08-03 14:27:46 +00:00
ai.robots.txt
d8de1ebdd5 chore: contribution note 2024-08-02 16:32:00 +00:00
ai.robots.txt
9d8d3de8ed Removing previously generated files 2024-08-02 16:31:59 +00:00
ai.robots.txt
b144225ece chore: drop in additional data 2024-08-01 22:33:23 +00:00
ai.robots.txt
06b950bce9 Removing previously generated files 2024-08-01 22:33:23 +00:00
ai.robots.txt
f18f0d99b9 chore: remove test data 2024-08-01 22:29:02 +00:00
ai.robots.txt
747cc834c4 Removing previously generated files 2024-08-01 22:29:01 +00:00
nisbet-hubbard
df89722038 Add PetalBot (and facebookexternalhit?) 2024-07-31 18:27:29 +08:00
fa7b64ae4b chore: add Scrapy 2024-07-30 10:28:46 -07:00
55b4505e30 chore: add Timpibot 2024-07-29 12:38:22 -07:00
d49e860b74 chore: add VelenPublicWebCrawler 2024-07-29 12:12:42 -07:00
6e323554c6 chore: add Meta-ExternalAgent 2024-07-29 08:27:31 -07:00
2972926532 chore: add OAI-SearchBot 2024-07-26 09:06:10 -07:00
af52578965 chore: drop google adbot; add GoogleOther bots 2024-07-16 12:05:34 -07:00
0ca6bce87e chore: add ImagesiftBot 2024-07-09 17:41:32 -07:00
0971af19b6 chore: peer39 unrelated to ai 2024-07-09 17:39:51 -07:00
89de2d2d91 chore: resolve conflict 2024-06-20 08:12:24 -07:00
a90ee5e9f0 chore: clean up bots and narrow scope 2024-06-20 08:09:21 -07:00
nisbet-hubbard
56c2285462 Update robots.txt 2024-06-20 11:31:08 +08:00
3f65a93891 chore: keeps Applebot-Extended in favor of Applebot as the latter is simply for search 2024-06-15 09:25:17 -07:00
Christopher Kirk-Nielsen
39363fc813 Block Applebot-Extended
Per [Apple's docs](https://support.apple.com/en-us/119829#datausage) ([via Matthew Bogart](https://matthewbogart.com/@matt/112605297864483766))
2024-06-12 16:43:12 -04:00
dea035365f chore: add Diffbot and scoopit 2024-05-05 14:50:04 -07:00
Cory Dransfeldt
118ec00126 chore: add img2dataset to robots.txt 2024-04-22 09:26:59 -07:00
Cory Dransfeldt
d6d40989f4 chore: add FriendlyCrawler to robots.txt 2024-04-08 12:40:59 -07:00
Cory Dransfeldt
47fc45f2f9 chore: add PiplBot 2024-04-06 20:25:28 -07:00
Cory Dransfeldt
46c8c9adb3 chore: add Meltwater 2024-04-03 08:56:30 -07:00
Cory Dransfeldt
c8a6d7f02d chore: add Seekr 2024-04-03 08:56:12 -07:00
--Explosion--
3e57b5ab5d Add GoogleOther
Used by Google to crawl for internal research and development. It’s unknown what exactly this entails, but is a generic user agent that is used when no other appropriate user agent is available. Documentation available from Google: https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
2024-03-28 10:00:58 -05:00
Cory Dransfeldt
297071a664 Update robots.txt 2024-03-27 11:41:12 -07:00
Cory Dransfeldt
8aeddbdce8 Create robots.txt 2024-03-27 10:59:01 -07:00