Robots.txt Generator
Generate a valid robots.txt file in seconds. Set user-agents, allow and disallow paths, crawl delay, and sitemap URL — no manual editing required.
Use * for all crawlers or a specific bot name, e.g. Googlebot
Optional. Leave blank to omit.
One path per line. Use / to block everything.
Override a disallow rule for a specific sub-path.
Optional. Google ignores this; useful for smaller crawlers.
User-agent: *
Disallow: /admin/
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
What is a robots.txt file and why does it matter?
A robots.txt file is a plain-text document you place at the root of your website
(https://yourdomain.com/robots.txt). It implements the
Robots Exclusion Protocol, signalling to well-behaved search engine crawlers which parts
of your site they should or should not request. Getting it right keeps crawl budget focused on
the pages that matter and keeps crawlers out of staging areas, admin panels, and duplicate content.
It does not, on its own, guarantee that those URLs stay out of search indices.
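To see how a crawler might interpret such a file, the following minimal sketch uses Python's standard `urllib.robotparser` module. The rules, the helper name `is_allowed`, and the bot name `ExampleBot` are illustrative assumptions, not part of this generator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, mirroring the sample output shown on this page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


admin_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/admin/login")
blog_ok = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/post")
```

Here `admin_ok` is `False` and `blog_ok` is `True`. Real crawlers use their own parsers, so treat this as an approximation of their behavior.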
How to use this generator
- User-agent: target a specific bot (e.g. Googlebot) or use * to apply rules to every crawler.
- Disallow paths: enter one path per line. Use / to block everything or /admin/ to block a directory.
- Allow paths: carve exceptions out of a broader disallow rule.
- Crawl-delay: optional seconds between requests; respected by some but not all crawlers (Google ignores it).
- Sitemap URL: adding your sitemap here lets crawlers discover it even without a Search Console submission.
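The fields above map onto a robots.txt file roughly as in the sketch below. The function `build_robots_txt` and its parameter names are hypothetical; this is one plausible way to assemble the output, not the generator's actual source.

```python
from typing import Optional, Sequence


def build_robots_txt(
    user_agent: str = "*",
    disallow: Sequence[str] = (),
    allow: Sequence[str] = (),
    crawl_delay: Optional[int] = None,
    sitemap: Optional[str] = None,
) -> str:
    """Assemble a single-group robots.txt from the form fields."""
    lines = [f"User-agent: {user_agent}"]
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # An empty Disallow value permits everything.
        lines.append("Disallow:")
    lines += [f"Allow: {path}" for path in allow]
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


example = build_robots_txt(
    disallow=["/admin/", "/private/"],
    sitemap="https://example.com/sitemap.xml",
)
```

With these inputs, `example` contains the same four lines as the sample output near the top of this page.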
Quick presets
Use the Allow all preset to emit a permissive robots.txt that lets every bot crawl every page — the right choice for most public sites. Switch to Block all when you want to prevent all crawling, such as on a staging environment or before a site launch.
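As a rough illustration, the two presets could correspond to the files below. The exact text this tool emits is an assumption here, and the helper `can_crawl` and bot name `ExampleBot` are hypothetical; the check uses Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Assumed preset outputs: an empty Disallow permits everything,
# while "Disallow: /" blocks every path.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


open_site = can_crawl(ALLOW_ALL, "https://example.com/any/page")
staging_site = can_crawl(BLOCK_ALL, "https://example.com/any/page")
```

Under these assumptions `open_site` is `True` and `staging_site` is `False`.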
Privacy
All output is generated locally in your browser. Your paths, sitemap URLs, and configuration are never transmitted to any server.
Frequently Asked Questions
What is a robots.txt file?
A robots.txt file sits at the root of your domain (e.g. https://example.com/robots.txt) and tells search engine crawlers which pages or directories they may or may not access. It is a plain-text file following the Robots Exclusion Protocol, long a de facto standard and formalized as RFC 9309 in 2022, which Google, Bing, and other major search engines support.
Does robots.txt actually block indexing?
No. It only controls whether a crawler may fetch a URL, not whether the URL appears in search results. A disallowed page can still be indexed if other sites link to it. To prevent indexing, use a <meta name="robots" content="noindex"> tag or an X-Robots-Tag HTTP response header on the target page, and leave that page crawlable: a crawler blocked by robots.txt never fetches the page, so it never sees the noindex directive.
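For pages served by application code, the X-Robots-Tag header can be attached to the response. The sketch below assumes a plain WSGI application in Python; the function name `noindex_app` and the page body are made up for illustration, and your server or framework will have its own way of setting headers.

```python
def noindex_app(environ, start_response):
    """Minimal WSGI app that asks search engines not to index its response."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        # Equivalent in effect to <meta name="robots" content="noindex">.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [b"<!doctype html><title>Internal page</title>"]
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server("", 8000, noindex_app)` from the standard library.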
Should I use Allow: / or leave Allow empty to permit all crawling?
Leaving both Allow and Disallow empty (or specifying only "User-agent: *") already permits full crawling, because that is the default. An explicit Allow rule is only meaningful when it carves a sub-path out of a broader Disallow rule, for example "Allow: /private/press-kit/" alongside "Disallow: /private/".
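The carve-out works because crawlers that follow RFC 9309 apply the most specific (longest) matching rule, and prefer Allow when an Allow and a Disallow rule match equally. The sketch below illustrates that precedence for plain path prefixes only; it ignores the * and $ wildcards, and the function name `is_path_allowed` is hypothetical. Note that Python's own `urllib.robotparser` uses first-match ordering instead, so it is not used here.

```python
from typing import Sequence


def is_path_allowed(path: str, allow: Sequence[str], disallow: Sequence[str]) -> bool:
    """Longest-prefix-match precedence; Allow wins ties; default is allowed."""
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for rule_allows, patterns in ((True, allow), (False, disallow)):
        for pattern in patterns:
            if not pattern or not path.startswith(pattern):
                continue  # empty or non-matching rules have no effect
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and rule_allows
            if longer or tie_for_allow:
                best_len = len(pattern)
                allowed = rule_allows
    return allowed


allow_rules = ["/private/press-kit/"]
disallow_rules = ["/private/"]

press_kit = is_path_allowed("/private/press-kit/logo.png", allow_rules, disallow_rules)
notes = is_path_allowed("/private/notes.txt", allow_rules, disallow_rules)
```

Here `press_kit` is `True` because the Allow rule is the longer match, while `notes` is `False` because only the Disallow rule applies.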