| Author | Commit | Message | Date |
|---|---|---|---|
| ai.robots.txt | f18f0d99b9 | chore: remove test data | 2024-08-01 22:29:02 +00:00 |
| ai.robots.txt | 747cc834c4 | Removing previously generated files | 2024-08-01 22:29:01 +00:00 |
| nisbet-hubbard | df89722038 | Add PetalBot (and facebookexternalhit ?) | 2024-07-31 18:27:29 +08:00 |
| | fa7b64ae4b | chore: add Scrapy | 2024-07-30 10:28:46 -07:00 |
| | 55b4505e30 | chore: add Timpibot | 2024-07-29 12:38:22 -07:00 |
| | d49e860b74 | chore: add VelenPublicWebCrawler | 2024-07-29 12:12:42 -07:00 |
| | 6e323554c6 | chore: add Meta-ExternalAgent | 2024-07-29 08:27:31 -07:00 |
| | 2972926532 | chore: add OAI-SearchBot | 2024-07-26 09:06:10 -07:00 |
| | af52578965 | chore: drop google adbot; add GoogleOther bots | 2024-07-16 12:05:34 -07:00 |
| | 0ca6bce87e | chore: add ImagesiftBot | 2024-07-09 17:41:32 -07:00 |
| | 0971af19b6 | chore: peer39 unrelated to ai | 2024-07-09 17:39:51 -07:00 |
| | 89de2d2d91 | chore: resolve conflict | 2024-06-20 08:12:24 -07:00 |
| | a90ee5e9f0 | chore: clean up bots and narrow scope | 2024-06-20 08:09:21 -07:00 |
| nisbet-hubbard | 56c2285462 | Update robots.txt | 2024-06-20 11:31:08 +08:00 |
| | 3f65a93891 | chore: keeps Applebot-Extended in favor of Applebot as the latter is simply for search | 2024-06-15 09:25:17 -07:00 |
| Christopher Kirk-Nielsen | 39363fc813 | Block Applebot-Extended. Per [Apple's docs](https://support.apple.com/en-us/119829#datausage) ([via Matthew Bogart](https://matthewbogart.com/@matt/112605297864483766)) | 2024-06-12 16:43:12 -04:00 |
| | dea035365f | chore: add Diffbot and scoopit | 2024-05-05 14:50:04 -07:00 |
| Cory Dransfeldt | 118ec00126 | chore: add img2dataset to robots.txt | 2024-04-22 09:26:59 -07:00 |
| Cory Dransfeldt | d6d40989f4 | chore: add FriendlyCrawler to robots.txt | 2024-04-08 12:40:59 -07:00 |
| Cory Dransfeldt | 47fc45f2f9 | chore: add PiplBot | 2024-04-06 20:25:28 -07:00 |
| Cory Dransfeldt | 46c8c9adb3 | chore: add Meltwater | 2024-04-03 08:56:30 -07:00 |
| Cory Dransfeldt | c8a6d7f02d | chore: add Seekr | 2024-04-03 08:56:12 -07:00 |
| --Explosion-- | 3e57b5ab5d | Add GoogleOther. Used by Google to crawl for internal research and development. It’s unknown what exactly this entails, but is a generic user agent that is used when no other appropriate user agent is available. Documentation available from Google: https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers | 2024-03-28 10:00:58 -05:00 |
| Cory Dransfeldt | 297071a664 | Update robots.txt | 2024-03-27 11:41:12 -07:00 |
| Cory Dransfeldt | 8aeddbdce8 | Create robots.txt | 2024-03-27 10:59:01 -07:00 |