robots.txt Generator
Control Search Engines and AI Crawlers

The only robots.txt generator that ships with the full 2026 AI crawler set - GPTBot, ClaudeBot, Google-Extended, PerplexityBot, Applebot-Extended, and 10 more - plus a companion llms.txt generator for AI discoverability. 100% client-side. Zero data stored.

AI-aware | 15 AI bots covered | Pairs with llms.txt

robots.txt Generator Tool

User Agents

For AI crawlers (GPTBot, ClaudeBot, etc.), use the AI Crawler Control card below.

AI Crawler Control

RJL.io exclusive

AI bots have two jobs: building training data and answering live user questions. Most generators lump them together. We don't.

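Training crawlers: these build or refresh a model's training corpus. Block them to opt out of model training. Click a bot to add it with Disallow: /.

Answer-engine crawlers: these fetch pages live to answer user questions and usually cite the source. Block them only if you also want to opt out of AI search referrals.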
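
The split between the two groups can be checked before deploying a file. The following is a minimal sketch, assuming a hypothetical policy that blocks GPTBot but leaves ChatGPT-User open; it uses Python's standard urllib.robotparser, and the example.com URL is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training, stay visible to answer engines.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

User-agent: *
Allow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
page = "https://example.com/blog/post"  # placeholder URL

training_allowed = parser.can_fetch("GPTBot", page)
answer_allowed = parser.can_fetch("ChatGPT-User", page)
print("GPTBot allowed:", training_allowed)
print("ChatGPT-User allowed:", answer_allowed)
```

This only shows how a compliant parser reads the file; whether a given crawler actually honors it is up to that crawler's operator.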

Pair with /llms.txt - a proposed companion file that gives AI crawlers a curated entry point to your best content. Unlike robots.txt, it does not control access.

Open llms.txt Generator

Common Disallow Patterns

Click to add to the first user-agent:

Sitemap URLs

Crawl Delay

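Note: Google ignores Crawl-delay, and the Search Console crawl-rate limiter was retired in January 2024. To slow Googlebot temporarily, Google's documentation recommends returning 500, 503, or 429 responses.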
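
Crawlers that do honor Crawl-delay read it from the group that matches their user agent. A minimal sketch with Python's standard urllib.robotparser; the Bingbot group, the 10-second value, and the /admin/ path are illustrative assumptions, not output of this tool.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file: a delay for one named crawler, none for the rest.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# crawl_delay() returns the delay in seconds for the matching group,
# or None when that group does not set one.
bing_delay = parser.crawl_delay("Bingbot")
other_delay = parser.crawl_delay("SomeOtherBot")
print(bing_delay, other_delay)
```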

Presets

Generated robots.txt

Place the robots.txt file in your website's root directory (e.g., https://example.com/robots.txt).
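
Crawlers look for the file at the root of each scheme-and-host combination, so a subdomain needs its own copy. As a rough sketch, the expected location for any page URL can be derived like this; robots_url is a hypothetical helper name and the URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


main_site = robots_url("https://example.com/blog/post?id=1#top")
subdomain = robots_url("https://shop.example.com/cart")
print(main_site)
print(subdomain)
```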

Crawler Reference

Search engines, AI training crawlers, and AI answer-engine bots - what each one does and why you might want to allow or block it.

Search engines

Googlebot

Google's main web crawler for search indexing.

Bingbot

Microsoft Bing's web crawler.

Baiduspider

Baidu's web crawler (China).

AI training crawlers

Build or refresh a model's training corpus. Block to opt out of training.

GPTBot

OpenAI's crawler for training GPT models.

ClaudeBot

Anthropic's current training crawler. Replaces the deprecated Claude-Web.

anthropic-ai

Anthropic's legacy training token. Still honored - include it for older configs.

Google-Extended

Google's opt-out token for Gemini and Vertex AI training. Does not affect search indexing.

Applebot-Extended

Apple's opt-out token for Apple Intelligence training. Does not affect Siri/Spotlight indexing.

PerplexityBot

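Perplexity's index crawler (separate from its live answer agent, Perplexity-User). Perplexity states that it indexes pages for its search results rather than for training foundation models.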

CCBot

Common Crawl - the open dataset that feeds many open-source LLMs.

Bytespider

ByteDance/TikTok crawler used for AI training.

Amazonbot

Amazon's general crawler, used for Alexa and AI products.

Meta-ExternalAgent

Meta's AI training crawler (Llama and Meta AI products).

DuckAssistBot

DuckDuckGo's AI assist crawler.

cohere-ai

Cohere's training crawler.
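
Blocking every token in this list comes down to one group with several User-agent lines and a single Disallow: / rule. The following is a minimal sketch of how such a group could be assembled; block_group is a hypothetical helper, and this is not necessarily how the generator itself builds its output.

```python
# Tokens copied from the list above.
TRAINING_BOTS = [
    "GPTBot", "ClaudeBot", "anthropic-ai", "Google-Extended",
    "Applebot-Extended", "PerplexityBot", "CCBot", "Bytespider",
    "Amazonbot", "Meta-ExternalAgent", "DuckAssistBot", "cohere-ai",
]


def block_group(user_agents: list[str]) -> str:
    """Build one robots.txt group that disallows everything for the given agents."""
    lines = [f"User-agent: {ua}" for ua in user_agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


# Blocked group first, then a catch-all group that allows everyone else.
robots_txt = block_group(TRAINING_BOTS) + "\nUser-agent: *\nAllow: /\n"
print(robots_txt)
```

Writing one group per bot instead is equally valid and makes it easier to lift a single block later.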

AI answer-engine crawlers

Fetch pages live to answer a user's question, and usually cite the source. Blocking these forfeits AI referral traffic.

ChatGPT-User

ChatGPT's browsing/web tool. Fetches on demand when a user asks ChatGPT about a URL or topic.

Perplexity-User

Perplexity's live answer agent. Pulls fresh pages to compose cited answers.

OAI-SearchBot

OpenAI's index crawler for SearchGPT-style features. Block to opt out of OpenAI search.

Why We Built robots.txt Generator

Writing a robots.txt file seems simple, but the syntax is unforgiving. A single misplaced directive, such as Disallow: / under User-agent: *, blocks compliant search engines from crawling your entire site and can wipe out your search visibility. We built this tool so developers can assemble correct robots.txt files visually.
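
To illustrate how small the difference can be: an empty Disallow: allows everything, while Disallow: / blocks everything. A minimal sketch using Python's standard urllib.robotparser; the allowed helper and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Check whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # one slash: whole site blocked
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty value: nothing blocked

page = "https://example.com/pricing"  # placeholder URL
blocked_result = allowed(BLOCK_ALL, "Googlebot", page)
open_result = allowed(ALLOW_ALL, "Googlebot", page)
print(blocked_result, open_result)
```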

Where we differ from every other robots.txt generator: most tools still ship a 2023-era AI bot list - Claude-Web (deprecated), no ClaudeBot, no Google-Extended, no Applebot-Extended, no PerplexityBot, no Meta-ExternalAgent. If you copy their output, you silently fail to opt out of every crawler they omit. We use the same canonical 15-bot list that our SEO Analyzer checks for, and we keep the two in sync, so a robots.txt written by our generator is one the analyzer will grade green.

We also split the AI side in two: training crawlers (block to opt out of model training) and answer-engine crawlers (block only if you also want to opt out of AI search referrals). That distinction matters - blocking ChatGPT-User to "stop AI" is a different decision from blocking GPTBot, and most generators do not surface it at all.

Finally, robots.txt is only half the AI crawler conversation in 2026. The other half is llms.txt, a proposed convention: a Markdown index that gives AI crawlers a curated entry point to your best content. We built a generator for that too, on the same site, with the same privacy guarantees.
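
For a sense of what that Markdown index looks like, here is a minimal sketch following the layout described in the llms.txt proposal: a title, a short blockquote summary, then sections of links. The render_llms_txt helper, the site name, and the URLs are all hypothetical, and this is not necessarily the output of the llms.txt Generator.

```python
def render_llms_txt(title: str, summary: str,
                    sections: dict[str, list[tuple[str, str, str]]]) -> str:
    """Render a title, a blockquote summary, and link sections as Markdown."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


llms_txt = render_llms_txt(
    "Example Site",
    "Guides and reference docs for the Example product.",
    {
        "Docs": [
            ("Quick start", "https://example.com/docs/start.md", "Install and first run"),
            ("API reference", "https://example.com/docs/api.md", "Endpoints and parameters"),
        ],
    },
)
print(llms_txt)
```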

robots.txt Generator is part of RJL.io's collection of free developer tools - each designed to do one thing exceptionally well, with no accounts, no tracking, and no data collection. Check out our other tools: llms.txt Generator, SEO Analyzer, .htaccess Generator, Meta Tag Generator, and more.
