The most comprehensive robots.txt builder online. Control AI crawlers, block bad bots, apply CMS-specific rules for WordPress, Shopify, Wix, and more. Generate, preview, and download your file instantly.
Already have a robots.txt? Upload it to populate all settings automatically and see exactly what you have.
Enter your site URL and select your platform. Your URL is used to auto-build your sitemap path and is never stored.
These apply to all crawlers (User-agent: *). Select common patterns to disallow.
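For example, a global group that keeps every crawler out of a search results path looks like this (paths here are illustrative, not recommendations):

```txt
User-agent: *
Disallow: /search/
Disallow: /private/
```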
Platform-specific paths are pre-selected based on your chosen CMS. Toggle any of them individually.
Note: /wp-admin/admin-ajax.php is auto-allowed when admin is blocked.
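In the generated file, that exception pairs an Allow with the broader Disallow, roughly:

```txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

Without the Allow line, WordPress features that rely on admin-ajax.php (front-end AJAX used by many themes and plugins) would be cut off from crawlers along with the admin area.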
No CMS-specific patterns for static HTML. Use Global Rules and the custom area below.
Use Global Rules and the custom area below for your platform.
Block transactional and user-specific pages that offer no SEO value and waste crawl budget.
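Typical exclusions of this kind look like the following (example paths; match them to your platform):

```txt
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Disallow: /my-account/
Disallow: /thank-you/
```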
Optionally add explicit Allow rules for major search engines. Useful after restrictive global rules. Note: not every crawler supports Allow; Google and Bing honor it, but some older bots ignore it.
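As a sketch, a restrictive global group followed by an explicit re-allow for Googlebot:

```txt
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
```

A crawler obeys the most specific User-agent group that matches it, so Googlebot follows its own group here and ignores the global block.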
Control access for AI and LLM crawlers individually. Each bot can be allowed, blocked, or left unspecified (inherits global rules).
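Each choice becomes its own User-agent group. For example (the bot names are real; the selection is illustrative):

```txt
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
```

Any AI bot left unspecified has no group of its own and falls back to the User-agent: * rules.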
Block SEO scrapers and spam crawlers that waste bandwidth and scrape your content for competitor intelligence tools.
Add your sitemap URL and configure optional advanced directives.
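The Sitemap directive stands outside any User-agent group and takes an absolute URL, e.g.:

```txt
Sitemap: https://yoursite.com/sitemap.xml
```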
Everything you need to know about robots.txt in 2026, from basic directives to AI crawler controls.
A plain-text file placed at the root of your domain (yoursite.com/robots.txt) that tells crawlers which pages they can or cannot access. Crawlers check this file before visiting any page on your site.
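A minimal robots.txt, just to show the shape of the file:

```txt
User-agent: *
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
```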
User-agent: specifies which bot the rules apply to (* = all bots)
Disallow: tells bots not to crawl a path
Allow: overrides a Disallow for a sub-path (not supported by every crawler)
Sitemap: points crawlers to your XML sitemap

AI crawlers have exploded in 2026. GPTBot traffic grew 305% year-over-year. Each major AI company now runs three separate bots: one for training, one for user-initiated fetches, and one for search indexing. This split lets you allow AI search visibility while blocking training-data scraping.
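OpenAI's trio illustrates the split. A file that keeps AI search visibility while opting out of training could read like this (a sketch, using OpenAI's published bot names):

```txt
# Training crawler
User-agent: GPTBot
Disallow: /

# AI search indexing
User-agent: OAI-SearchBot
Allow: /

# User-initiated fetches from ChatGPT
User-agent: ChatGPT-User
Allow: /
```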
Note: Perplexity has documented compliance issues, including crawling under undeclared user-agent strings, so robots.txt alone may not fully stop it.
A common mistake is setting Disallow: / for all bots and forgetting to re-allow search engines.

robots.txt controls whether bots can crawl your pages (a disallowed page can still be indexed if other sites link to it; use a noindex meta tag to keep it out of results). It is the established standard, respected by all major search engines and AI crawlers.
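The over-broad version of that mistake is just two lines, and it shuts out every crawler, search engines included:

```txt
User-agent: *
Disallow: /
```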
llms.txt is an emerging (not yet standardized) Markdown file at yoursite.com/llms.txt that guides AI assistants on how to interpret, summarize, and cite your content. Claude (Anthropic) officially supports it. Google does not yet.
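An llms.txt file is plain Markdown. A minimal sketch following the proposed format (titles and links are placeholders):

```markdown
# YourSite

> One-sentence summary of what the site covers.

## Docs

- [Getting started](https://yoursite.com/docs/start): setup guide
```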
To block ChatGPT's training crawler, add User-agent: GPTBot followed by Disallow: /. OpenAI, Anthropic, and Google all formally respect robots.txt. The "Block Training Only" preset allows search/answer bots while blocking training crawlers.

Google has ignored Crawl-delay for years, though Bing and Yandex still support it. Modern bots automatically detect server stress via HTTP 429 responses. If you have a genuinely resource-limited server, use Google Search Console to set crawl rate limits directly instead.

llms.txt is a Markdown file at yoursite.com/llms.txt that provides guidance to AI assistants about how to interpret your content. It is not yet a formal web standard: Google does not support it, but Anthropic (Claude) officially endorses it. If AI visibility matters for your site, it is worth adding alongside your robots.txt.

For WordPress, the recommended blocks are /wp-admin/ (with /wp-admin/admin-ajax.php auto-allowed), /wp-includes/, /wp-json/, /xmlrpc.php, /trackback/, and /feed/. You can toggle each individually and add custom paths.
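The "Block Training Only" preset mentioned above amounts to groups like these (an illustrative, non-exhaustive bot list):

```txt
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Search/answer bots such as OAI-SearchBot get no Disallow group of their own, so they continue to follow the global rules.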