Robots.txt Generator - Build & Validate Free

Build and validate robots.txt files visually

Build a valid robots.txt file visually without memorizing syntax. Start from a preset for your platform, add custom rules, and validate everything before deploying — no more accidentally blocking your entire site from Google.

Pro tip: Never block CSS and JavaScript files in robots.txt — Google needs to render your pages to evaluate content and mobile-friendliness. Blocking /wp-content/ or /assets/ is one of the most common robots.txt mistakes and can silently tank your search rankings for months.


    What Is robots.txt and How Does It Work?

    Every website can include a plain-text file at /robots.txt that tells search engine crawlers and other automated bots which parts of the site they may or may not access. When Googlebot, Bingbot, or any well-behaved crawler arrives at your domain, the first thing it requests is https://yourdomain.com/robots.txt. If the file exists, the crawler reads it line by line and obeys the directives inside — skipping paths you have marked as Disallow and freely crawling anything you have explicitly or implicitly allowed. It is important to understand that robots.txt is advisory, not enforceable; malicious scrapers can ignore it entirely. However, every major search engine and most legitimate bots honour the protocol. The file must be placed at the root of your domain (not a subdirectory) and must be served as plain text with a 200 status code. If the server returns a 404 for robots.txt, crawlers assume everything is fair game.
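A minimal file illustrating the protocol might look like this (the domain and path here are placeholders, not a recommendation for any particular site):

```text
# Served at https://yourdomain.com/robots.txt
# Must live at the domain root, as plain text, with a 200 status code.

User-agent: *
Disallow: /private/

Sitemap: https://yourdomain.com/sitemap.xml
```

A well-behaved crawler reading this file skips everything under /private/ and crawls the rest of the site freely.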

    robots.txt Syntax Reference Guide

    The format is deceptively simple but has important nuances that trip up even experienced developers. Every block starts with a User-agent line that names the crawler the rules apply to — use * for all bots or a specific name like Googlebot for targeted rules. Below the User-agent line you place one or more Disallow directives (paths the bot may not visit) and optional Allow directives (paths within a disallowed tree that should remain accessible). Paths are case-sensitive and use prefix matching: Disallow: /admin blocks /admin, /admin/, and /admin/settings/users alike. Two wildcard characters are supported in the Google and Bing implementations: * matches any sequence of characters and $ anchors the match to the end of the URL. For example, Disallow: /*.pdf$ blocks all URLs ending in .pdf anywhere on the site. The Sitemap directive tells crawlers where your XML sitemap lives and can appear anywhere in the file. Google ignores the Crawl-delay directive, but Bing, Yandex, and several other engines respect it as a polite request to pause between requests.
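A single file exercising each of those directives might look like this (paths, bot names, and domain are illustrative):

```text
# Rules for all crawlers
User-agent: *
Disallow: /admin          # prefix match: also blocks /admin/ and /admin/settings
Disallow: /*.pdf$         # wildcard + end anchor: any URL ending in .pdf
Allow: /admin/public/     # re-opens part of a disallowed tree

# Targeted rules for one specific bot
User-agent: GPTBot
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml
Crawl-delay: 10           # ignored by Google; respected by Bing and Yandex
```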

    Common robots.txt Mistakes That Hurt SEO

    The single most damaging mistake is accidentally blocking CSS and JavaScript resources that Google needs to render your page. If Googlebot cannot load your stylesheets and scripts, it cannot evaluate mobile-friendliness or understand dynamic content — and your rankings suffer silently because Google Search Console may not surface the issue prominently. A close second is the bare Disallow: / with no further Allow directives, which blocks the entire site. This happens more often than you would expect, especially on staging sites that go live without removing the development robots.txt. Other frequent issues include:

    • Conflicting rules — having both Allow: /blog and Disallow: /blog under the same User-agent creates ambiguity. Google resolves conflicts by favouring the most specific path, but other crawlers may not.
    • Missing trailing slash — Disallow: /admin also blocks /administrator because robots.txt uses prefix matching, not exact matching.
    • No sitemap declaration — while not strictly required, omitting the Sitemap directive forces crawlers to discover your sitemap through other channels, which can delay indexing of new content.
    • Blocking query-parameter pages — Disallow: /*? blocks every URL containing a query string, which can inadvertently hide paginated content, search results, or filtered product pages that you actually want indexed.
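The prefix-matching pitfall above is easy to reproduce with Python's standard-library urllib.robotparser. Note that this parser implements plain prefix rules only (no * or $ wildcards), so treat it as a quick sanity check rather than a stand-in for Googlebot:

```python
from urllib.robotparser import RobotFileParser

# A rule intended to block only the /admin section.
rp = RobotFileParser()
rp.parse("User-agent: *\nDisallow: /admin".splitlines())

# Prefix matching means /administrator is caught as well.
print(rp.can_fetch("*", "/admin/settings"))   # False (intended)
print(rp.can_fetch("*", "/administrator"))    # False (collateral damage)
print(rp.can_fetch("*", "/blog"))             # True

# A trailing slash scopes the rule to the directory only.
# (A fresh parser is used because RobotFileParser.parse does not
# replace an already-registered "*" entry on a second call.)
rp2 = RobotFileParser()
rp2.parse("User-agent: *\nDisallow: /admin/".splitlines())
print(rp2.can_fetch("*", "/administrator"))   # True
```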

    robots.txt Examples for WordPress, Shopify, and Custom Sites

    A well-configured WordPress robots.txt blocks /wp-admin/ (the dashboard) while explicitly allowing /wp-admin/admin-ajax.php (required by many plugins for front-end functionality). It disallows /wp-includes/ to prevent crawlers from indexing raw PHP templates, and it may block /author/ archives if thin-content author pages do not add SEO value. Crucially, it does not block /wp-content/uploads/ (your media), /wp-content/themes/ (your CSS), or /wp-content/plugins/ (your scripts). Shopify stores have a platform-generated robots.txt that blocks admin paths, checkout URLs, cart pages, and internal search. You can customise it via the robots.txt.liquid template if you need additional rules. For custom-built sites — whether Next.js, Laravel, Rails, or static — the principle is the same: block administrative, internal, and duplicate-content paths while leaving all user-facing content, assets, and sitemaps accessible. A Next.js app typically blocks /_next/static/ build hashes from being indexed as content pages while allowing everything else, and a Laravel project blocks /storage/, /vendor/, and /nova/ (if using Nova).
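A sketch of the WordPress configuration described above (adjust paths to your install; the /author/ line is optional and depends on whether your author archives carry SEO value):

```text
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php   # needed by many plugins for front-end AJAX
Disallow: /wp-includes/
Disallow: /author/                # optional: thin-content author archives

# Note: /wp-content/ is deliberately NOT blocked, because Google needs
# uploads, theme CSS, and plugin JS to render the site.

Sitemap: https://yourdomain.com/sitemap.xml
```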

    How to Test Your robots.txt File

    Before deploying a new robots.txt, always test it. Google Search Console provides a robots.txt report (under Settings) that shows which version of your file Google last fetched and flags parse errors; it replaced the retired standalone robots.txt Tester. The tool on this page provides similar functionality without leaving the browser — enter a path like /admin/settings, choose a user-agent, and instantly see the verdict. Beyond testing individual URLs, review the validation report for systemic issues: conflicting rules, blocked assets, or a missing sitemap. Once deployed, monitor Search Console's Page indexing (formerly Coverage) report for “Blocked by robots.txt” errors — this is the fastest way to catch rules that accidentally exclude important pages. Remember that changes to robots.txt can take hours or even days to propagate to Google; the file is cached on their end. If you need an urgent recrawl, use the URL Inspection tool to request indexing of specific pages, or submit an updated sitemap.
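The same kind of batch check can be scripted with Python's standard library. This sketch parses a robots.txt string and prints a verdict per path; the rules and paths are illustrative. One caveat: urllib.robotparser applies rules in file order (first match wins) rather than Google's most-specific-path rule, so list Allow exceptions before the broader Disallow they carve out:

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
"""

# Paths to simulate, as in a batch crawl check.
PATHS = [
    "/",
    "/blog/hello-world",
    "/wp-admin/settings.php",
    "/wp-admin/admin-ajax.php",
]

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

for path in PATHS:
    verdict = "crawlable" if rp.can_fetch("Googlebot", path) else "blocked"
    print(f"{path}: {verdict}")
```

Running this reports /wp-admin/settings.php as blocked while the root, the blog post, and admin-ajax.php remain crawlable.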

    Looking for more developer tools? Explore all Dev & Tech tools on EvvyTools.
