125 commits

Author SHA1 Message Date
ai.robots.txt
f18f0d99b9 chore: remove test data 2024-08-01 22:29:02 +00:00
ai.robots.txt
747cc834c4 Removing previously generated files 2024-08-01 22:29:01 +00:00
nisbet-hubbard
df89722038
Add PetalBot (and facebookexternalhit?) 2024-07-31 18:27:29 +08:00
fa7b64ae4b
chore: add Scrapy 2024-07-30 10:28:46 -07:00
55b4505e30
chore: add Timpibot 2024-07-29 12:38:22 -07:00
d49e860b74
chore: add VelenPublicWebCrawler 2024-07-29 12:12:42 -07:00
6e323554c6
chore: add Meta-ExternalAgent 2024-07-29 08:27:31 -07:00
2972926532
chore: add OAI-SearchBot 2024-07-26 09:06:10 -07:00
af52578965
chore: drop google adbot; add GoogleOther bots 2024-07-16 12:05:34 -07:00
0ca6bce87e
chore: add ImagesiftBot 2024-07-09 17:41:32 -07:00
0971af19b6
chore: peer39 unrelated to ai 2024-07-09 17:39:51 -07:00
89de2d2d91
chore: resolve conflict 2024-06-20 08:12:24 -07:00
a90ee5e9f0
chore: clean up bots and narrow scope 2024-06-20 08:09:21 -07:00
nisbet-hubbard
56c2285462
Update robots.txt 2024-06-20 11:31:08 +08:00
3f65a93891
chore: keeps Applebot-Extended in favor of Applebot as the latter is simply for search 2024-06-15 09:25:17 -07:00
Christopher Kirk-Nielsen
39363fc813
Block Applebot-Extended
Per [Apple's docs](https://support.apple.com/en-us/119829#datausage) ([via Matthew Bogart](https://matthewbogart.com/@matt/112605297864483766))
2024-06-12 16:43:12 -04:00
dea035365f
chore: add Diffbot and scoopit 2024-05-05 14:50:04 -07:00
Cory Dransfeldt
118ec00126
chore: add img2dataset to robots.txt 2024-04-22 09:26:59 -07:00
Cory Dransfeldt
d6d40989f4
chore: add FriendlyCrawler to robots.txt 2024-04-08 12:40:59 -07:00
Cory Dransfeldt
47fc45f2f9
chore: add PiplBot 2024-04-06 20:25:28 -07:00
Cory Dransfeldt
46c8c9adb3
chore: add Meltwater 2024-04-03 08:56:30 -07:00
Cory Dransfeldt
c8a6d7f02d
chore: add Seekr 2024-04-03 08:56:12 -07:00
--Explosion--
3e57b5ab5d
Add GoogleOther
Used by Google to crawl for internal research and development. It’s unknown what exactly this entails, but is a generic user agent that is used when no other appropriate user agent is available. Documentation available from Google: https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
2024-03-28 10:00:58 -05:00
Cory Dransfeldt
297071a664
Update robots.txt 2024-03-27 11:41:12 -07:00
Cory Dransfeldt
8aeddbdce8
Create robots.txt 2024-03-27 10:59:01 -07:00