mirror of https://github.com/ai-robots-txt/ai.robots.txt.git, synced 2025-04-04 19:13:57 +00:00

For these simple tests Python's built-in unittest framework is more than enough. No additional dependencies are required. Added some more test cases with "special" characters to test the escaping code better.
47 lines
1.2 KiB
Text
User-agent: AI2Bot
User-agent: Ai2Bot-Dolma
User-agent: Amazonbot
User-agent: anthropic-ai
User-agent: Applebot
User-agent: Applebot-Extended
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: Claude-Web
User-agent: ClaudeBot
User-agent: cohere-ai
User-agent: Diffbot
User-agent: FacebookBot
User-agent: facebookexternalhit
User-agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
User-agent: GPTBot
User-agent: iaskspider/2.0
User-agent: ICC-Crawler
User-agent: ImagesiftBot
User-agent: img2dataset
User-agent: ISSCyberRiskCrawler
User-agent: Kangaroo Bot
User-agent: Meta-ExternalAgent
User-agent: Meta-ExternalFetcher
User-agent: OAI-SearchBot
User-agent: omgili
User-agent: omgilibot
User-agent: PerplexityBot
User-agent: PetalBot
User-agent: Scrapy
User-agent: Sidetrade indexer bot
User-agent: Timpibot
User-agent: VelenPublicWebCrawler
User-agent: Webzio-Extended
User-agent: YouBot
User-agent: crawler.with.dots
User-agent: star***crawler
User-agent: Is this a crawler?
User-agent: a[mazing]{42}(robot)
User-agent: 2^32$
User-agent: curl|sudo bash
Disallow: /
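
The last six entries in this file (crawler.with.dots through curl|sudo bash) are the "special" character cases the commit note refers to. As a rough illustration of how such names can be checked with Python's built-in unittest, here is a minimal sketch. The helpers render_robots_txt and build_block_pattern are hypothetical names made up for this example and are not taken from the repository; the sketch assumes that robots.txt output keeps names verbatim, while any regex built from the same names needs re.escape.

```python
import re
import unittest

# A few user agents from the list above, including the "special" ones.
USER_AGENTS = [
    "GPTBot",
    "crawler.with.dots",
    "star***crawler",
    "Is this a crawler?",
    "a[mazing]{42}(robot)",
    "2^32$",
    "curl|sudo bash",
]


def render_robots_txt(user_agents):
    """Hypothetical generator: one User-agent line per name, then one Disallow."""
    lines = [f"User-agent: {name}" for name in user_agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def build_block_pattern(user_agents):
    """Hypothetical helper: a regex alternation matching each name literally."""
    return re.compile("|".join(re.escape(name) for name in user_agents))


class EscapingTest(unittest.TestCase):
    def test_robots_txt_keeps_names_verbatim(self):
        text = render_robots_txt(USER_AGENTS)
        self.assertIn("User-agent: a[mazing]{42}(robot)\n", text)
        self.assertTrue(text.endswith("Disallow: /\n"))

    def test_pattern_matches_each_name_literally(self):
        pattern = build_block_pattern(USER_AGENTS)
        for name in USER_AGENTS:
            self.assertIsNotNone(pattern.fullmatch(name))

    def test_metacharacters_are_not_interpreted(self):
        pattern = build_block_pattern(USER_AGENTS)
        # Unescaped, "." would match any character and "|" would split the name.
        self.assertIsNone(pattern.fullmatch("crawlerXwithXdots"))
        self.assertIsNone(pattern.fullmatch("curl"))
        self.assertIsNone(pattern.fullmatch("sudo bash"))
```

Saved as a module, a file like this can be run with python -m unittest and needs no third-party packages.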