robots.txt Generator
Control Search Engines and AI Crawlers
The only robots.txt generator that ships with the full 2026 AI crawler set - GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Applebot-Extended, and 10 more - plus a companion llms.txt generator for AI discoverability. 100% client-side. Zero data stored.
robots.txt Generator Tool
User Agents
For AI crawlers (GPTBot, ClaudeBot, etc.), use the AI Crawler Control card below.
AI Crawler Control
RJL.io exclusive
AI bots have two jobs: building training data and answering live user questions. Most generators lump them together. We don't.
Training crawlers: block to opt out of model training. Click a bot to add it with Disallow: /.
Answer-engine crawlers: these fetch pages live to answer user questions and usually cite the source. Block only if you also want to opt out of AI search referrals.
Pair with /llms.txt - the AI-era robots.txt that gives AI crawlers a curated entry point to your best content. Open llms.txt Generator
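As a rough illustration of the training versus answer-engine split, here is a minimal Python sketch that writes one blocked group and one allowed group. The function name build_robots, the shortened bot lists, and the example.com sitemap URL are assumptions made for this example; this is not the generator's actual code.

```python
# Hypothetical helper: emits a robots.txt with one blocked group (training
# crawlers), one explicitly allowed group (answer-engine crawlers), and a
# catch-all group for everything else.
TRAINING_BOTS = ["GPTBot", "ClaudeBot", "Google-Extended", "CCBot"]
ANSWER_BOTS = ["ChatGPT-User", "Perplexity-User", "OAI-SearchBot"]


def build_robots(block, allow=(), sitemap=None):
    lines = []
    if block:
        # Several User-agent lines may share the rules that follow them.
        lines += [f"User-agent: {bot}" for bot in block]
        lines += ["Disallow: /", ""]
    if allow:
        lines += [f"User-agent: {bot}" for bot in allow]
        lines += ["Allow: /", ""]
    # Every crawler not named above falls through to this group.
    lines += ["User-agent: *", "Allow: /"]
    if sitemap:
        lines += ["", f"Sitemap: {sitemap}"]
    return "\n".join(lines) + "\n"


print(build_robots(TRAINING_BOTS, ANSWER_BOTS,
                   sitemap="https://example.com/sitemap.xml"))
```

Keeping the two lists separate means that unblocking AI search referrals later is a one-line change that leaves the training opt-out untouched.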
Common Disallow Patterns
Click to add to the first user-agent:
Sitemap URLs
Crawl Delay
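Note: Google ignores Crawl-delay, and the legacy crawl-rate setting in Google Search Console was retired in early 2024. To slow Googlebot temporarily, Google's guidance is to return 500, 503, or 429 responses.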
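Crawl-delay is advisory: each crawler decides whether to honor it. The sketch below uses Python's standard urllib.robotparser to show how a client that does honor the directive might read it. The Bingbot group and the 10-second value are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: a delay for one named crawler, none for the rest.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10
Disallow:

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the value from the matching group, or None when
# that group sets no delay.
bing_delay = parser.crawl_delay("Bingbot")
other_delay = parser.crawl_delay("SomeOtherBot")
print(bing_delay, other_delay)
```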
Presets
Generated robots.txt
Crawler Reference
Search engines, AI training crawlers, and AI answer-engine bots - what each one does and why you might want to allow or block it.
Search engines
Googlebot
Google's main web crawler for search indexing.
Bingbot
Microsoft Bing's web crawler.
Baiduspider
Baidu's web crawler (China).
AI training crawlers
Build or refresh a model's training corpus. Block to opt out of training.
GPTBot
OpenAI's crawler for training GPT models.
ClaudeBot
Anthropic's current training crawler. Replaces the deprecated Claude-Web.
anthropic-ai
Anthropic's legacy training token. Still honored - include it for older configs.
Google-Extended
Google's opt-out token for Gemini and Vertex AI training. Does not affect search indexing.
Applebot-Extended
Apple's opt-out token for Apple Intelligence training. Does not affect Siri/Spotlight indexing.
PerplexityBot
Perplexity's training crawler (separate from its live answer agent).
CCBot
Common Crawl - the open dataset that feeds many open-source LLMs.
Bytespider
ByteDance/TikTok crawler used for AI training.
Amazonbot
Amazon's general crawler, used for Alexa and AI products.
Meta-ExternalAgent
Meta's AI training crawler (Llama and Meta AI products).
DuckAssistBot
DuckDuckGo's AI assist crawler.
cohere-ai
Cohere's training crawler.
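To sanity-check that blocking the training tokens listed above leaves ordinary search crawlers untouched, a file can be parsed with Python's standard urllib.robotparser. This is a sketch only: the parser's matching rules approximate, and may differ from, how each real crawler interprets the file, and the /article path is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# The twelve training tokens from the reference list above.
TRAINING_TOKENS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
    "Applebot-Extended", "PerplexityBot", "CCBot", "Bytespider",
    "Amazonbot", "Meta-ExternalAgent", "DuckAssistBot", "cohere-ai",
]

# One group blocks every training token; the catch-all group blocks nothing.
robots_txt = "\n".join(
    [f"User-agent: {token}" for token in TRAINING_TOKENS]
    + ["Disallow: /", "", "User-agent: *", "Disallow:"]
) + "\n"

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Tokens the parser reports as blocked for a sample path.
blocked = [t for t in TRAINING_TOKENS if not parser.can_fetch(t, "/article")]

# Search crawlers should still fall through to the permissive catch-all.
search_ok = all(
    parser.can_fetch(bot, "/article")
    for bot in ("Googlebot", "Bingbot", "Baiduspider")
)
print(len(blocked), search_ok)
```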
AI answer-engine crawlers
Fetch a single page live to answer a user's question and usually cite the source. Blocking these forfeits AI referral traffic.
ChatGPT-User
ChatGPT's browsing/web tool. Fetches on demand when a user asks ChatGPT about a URL or topic.
Perplexity-User
Perplexity's live answer agent. Pulls fresh pages to compose cited answers.
OAI-SearchBot
OpenAI's index crawler for SearchGPT-style features. Block to opt out of OpenAI search.
Why We Built robots.txt Generator
Writing robots.txt files seems simple, but getting the syntax right matters. A misplaced directive can accidentally block search engines from your entire site, devastating your SEO. We built this tool to help developers create correct robots.txt files visually.
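The gap between a harmless file and a site-wide block can be a single character. The sketch below compares three near-identical files using Python's standard urllib.robotparser. The helper name allowed and the /pricing and /admin/ paths are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent="Googlebot", path="/pricing"):
    """Return True if `agent` may fetch `path` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# A lone slash blocks the whole site for every crawler.
BLOCKS_EVERYTHING = "User-agent: *\nDisallow: /\n"
# An empty Disallow value blocks nothing.
BLOCKS_NOTHING = "User-agent: *\nDisallow:\n"
# A path prefix blocks only URLs that start with that prefix.
BLOCKS_ADMIN_ONLY = "User-agent: *\nDisallow: /admin/\n"

print(allowed(BLOCKS_EVERYTHING))                        # False
print(allowed(BLOCKS_NOTHING))                           # True
print(allowed(BLOCKS_ADMIN_ONLY))                        # True
print(allowed(BLOCKS_ADMIN_ONLY, path="/admin/users"))   # False
```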
Where we differ from other robots.txt generators: most tools still ship a 2023-era AI bot list that includes Claude-Web (deprecated) but omits ClaudeBot, Google-Extended, Applebot-Extended, PerplexityBot, and Meta-ExternalAgent. If you copy their output, you silently fail to opt out of several crawlers that collect training data from your site. We use the same canonical 15-bot list our SEO Analyzer checks for, and we keep the two in sync, so a robots.txt our generator writes is one the analyzer will grade green.
We also split the AI side in two: training crawlers (block to opt out of model training) and answer-engine crawlers (block only if you also want to opt out of AI search referrals). That distinction matters - blocking ChatGPT-User to "stop AI" is a different decision than blocking GPTBot, and most generators do not surface it at all.
Finally, robots.txt is only half the AI crawler conversation in 2026. The other half is llms.txt, a Markdown index that gives AI crawlers a curated entry point to your best content. We built a generator for that too, on the same site, with the same privacy guarantees.
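For a sense of what such a Markdown index might look like, here is a minimal Python sketch that assembles one. The layout (a title heading, a one-line blockquote summary, then sections of annotated links) follows the commonly circulated llms.txt proposal as we understand it. The function name build_llms_txt and the example.com URLs are assumptions, not the output of the llms.txt Generator.

```python
def build_llms_txt(title, summary, sections):
    """Assemble a small llms.txt-style Markdown index.

    `sections` maps a section heading to a list of (name, url, note) tuples;
    `note` may be an empty string.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines += [f"## {heading}", ""]
        for name, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{name}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


doc = build_llms_txt(
    "Example Site",
    "Docs and guides.",
    {"Docs": [("Quick start", "https://example.com/start.md",
               "Install and first run")]},
)
print(doc)
```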
robots.txt Generator is part of RJL.io's collection of free developer tools - each designed to do one thing exceptionally well, with no accounts, no tracking, and no data collection. Check out our other tools: llms.txt Generator, SEO Analyzer, .htaccess Generator, Meta Tag Generator, and more.