Build a valid robots.txt file visually without memorizing syntax. Start from a preset for your platform, add custom rules, and validate everything before deploying — no more accidentally blocking your entire site from Google.
Pro tip: Never block CSS and JavaScript files in robots.txt — Google
needs to render your pages to evaluate content and mobile-friendliness. Blocking
/wp-content/ or /assets/ is one of the most common robots.txt
mistakes and can silently tank your search rankings for months.
What Is robots.txt and How Does It Work?
Every website can include a plain-text file at /robots.txt that tells search
engine crawlers and other automated bots which parts of the site they may or may not access.
When Googlebot, Bingbot, or any well-behaved crawler arrives at your domain, the first thing
it requests is https://yourdomain.com/robots.txt. If the file exists, the crawler
reads it line by line and obeys the directives inside — skipping paths you have marked
as Disallow and freely crawling anything you have explicitly or implicitly
allowed. It is important to understand that robots.txt is advisory, not enforceable; malicious
scrapers can ignore it entirely. However, every major search engine and most legitimate bots
honour the protocol. The file must be placed at the root of your domain (not a subdirectory)
and must be served as plain text with a 200 status code. If the server returns a 404 for
robots.txt, crawlers assume everything is fair game.
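For instance, the smallest useful robots.txt follows this shape (the domain and paths below are placeholders):

```txt
# Served from https://yourdomain.com/robots.txt
User-agent: *
Disallow: /admin/

Sitemap: https://yourdomain.com/sitemap.xml
```

Anything not disallowed is implicitly allowed, so an empty file and a missing file have the same practical effect for most crawlers: everything gets crawled.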
robots.txt Syntax Reference Guide
The format is deceptively simple but has important nuances that trip up even experienced
developers. Every block starts with a User-agent line that names the crawler
the rules apply to — use * for all bots or a specific name like
Googlebot for targeted rules. Below the User-agent line you place one or more
Disallow directives (paths the bot may not visit) and optional
Allow directives (paths within a disallowed tree that should remain
accessible). Paths are case-sensitive and use prefix matching: Disallow: /admin
blocks /admin, /admin/, and /admin/settings/users
alike. Two wildcard characters are supported in the Google and Bing implementations:
* matches any sequence of characters and $ anchors the match to
the end of the URL. For example, Disallow: /*.pdf$ blocks all URLs ending in
.pdf anywhere on the site. The Sitemap directive tells crawlers
where your XML sitemap lives and can appear anywhere in the file. Google ignores the
Crawl-delay directive, but Bing, Yandex, and several other engines respect
it as a polite request to pause between requests.
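Putting those nuances together, a file using both wildcard forms might look like this (example paths only; Crawl-delay is shown for the engines that honour it):

```txt
User-agent: Googlebot
Disallow: /*.pdf$        # any URL ending in .pdf
Disallow: /*?sessionid=  # any URL containing this query parameter

User-agent: *
Crawl-delay: 10          # ignored by Google; a pause hint for Bing and Yandex

Sitemap: https://example.com/sitemap.xml
```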
Common robots.txt Mistakes That Hurt SEO
The single most damaging mistake is accidentally blocking CSS and JavaScript resources that
Google needs to render your page. If Googlebot cannot load your stylesheets and scripts, it
cannot evaluate mobile-friendliness or understand dynamic content — and your rankings
suffer silently because Google Search Console may not surface the issue prominently. A close
second is the bare Disallow: / with no further Allow directives, which blocks
the entire site. This happens more often than you would expect, especially on staging sites
that go live without removing the development robots.txt. Other frequent issues include:
- Conflicting rules — having both Allow: /blog and Disallow: /blog under the same User-agent creates ambiguity. Google resolves conflicts by favouring the most specific (longest) path, but other crawlers may not.
- Missing trailing slash — Disallow: /admin also blocks /administrator because robots.txt uses prefix matching, not exact matching.
- No sitemap declaration — while not strictly required, omitting the Sitemap directive forces crawlers to discover your sitemap through other channels, which can delay indexing of new content.
- Blocking query-parameter pages — Disallow: /*? blocks all URLs with query strings, which can inadvertently hide paginated content, search results, or filtered product pages that you actually want indexed.
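Two of those mistakes are easiest to see side by side. The rules below are illustrative fragments, not a recommended file:

```txt
# Prefix matching surprise: this also blocks /administrator and /admin-tools
Disallow: /admin

# Intent was a single directory: add the trailing slash
Disallow: /admin/

# Conflicting rules: Google keeps the longest (most specific) match,
# so /blog/guide stays crawlable for Googlebot; other bots may differ
Disallow: /blog
Allow: /blog/guide
```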
robots.txt Examples for WordPress, Shopify, and Custom Sites
A well-configured WordPress robots.txt blocks /wp-admin/ (the dashboard) while
explicitly allowing /wp-admin/admin-ajax.php (required by many plugins for
front-end functionality). It disallows /wp-includes/ to prevent crawlers from
indexing raw PHP templates, and it may block /author/ archives if thin-content
author pages do not add SEO value. Crucially, it does not block
/wp-content/uploads/ (your media), /wp-content/themes/ (your CSS),
or /wp-content/plugins/ (your scripts). Shopify stores have a platform-generated
robots.txt that blocks admin paths, checkout URLs, cart pages, and internal search. You can
customise it via the robots.txt.liquid template if you need additional rules.
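A sketch of the WordPress configuration just described (swap in your own sitemap URL):

```txt
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php   # many plugins call this from the front end
Disallow: /wp-includes/
# /wp-content/ is deliberately left crawlable: uploads, themes, and
# plugins hold the media, CSS, and JS Google needs to render pages

Sitemap: https://example.com/sitemap.xml
```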
For custom-built sites — whether Next.js, Laravel, Rails, or static — the
principle is the same: block administrative, internal, and duplicate-content paths while
leaving all user-facing content, assets, and sitemaps accessible. A Next.js app typically
blocks /api/ routes and other server-only paths while leaving /_next/static/ assets
crawlable (blocking build assets would stop Google from rendering your pages, the very
mistake warned about above), and a Laravel project blocks /storage/,
/vendor/, and /nova/ (if using Nova).
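As an illustrative Laravel-flavoured example of those principles (paths assume a default install; adjust to your app):

```txt
User-agent: *
Disallow: /nova/      # admin panel, if using Laravel Nova
Disallow: /vendor/    # third-party PHP packages
Disallow: /storage/   # only if nothing user-facing is served from here

Sitemap: https://example.com/sitemap.xml
```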
How to Test Your robots.txt File
Before deploying a new robots.txt, always test it. Google Search Console offers a
robots.txt report under Settings that shows which versions of the file Google has
fetched and flags any parse errors; the legacy robots.txt Tester, which let you check
individual URLs against your rules, has been retired. The tool on this page provides
that per-URL testing without leaving the browser
— enter a path like /admin/settings, choose a user-agent, and instantly
see the verdict. Beyond testing individual URLs, review the validation report for systemic
issues: conflicting rules, blocked assets, or a missing sitemap. Once deployed, monitor
Google Search Console’s Page indexing report (formerly Coverage) for “Blocked by robots.txt” errors
— this is the fastest way to catch rules that accidentally exclude important pages.
Remember that changes to robots.txt can take hours or even days to propagate to Google;
the file is cached on their end. If you need an urgent recrawl, use the URL Inspection tool
to request indexing of specific pages, or submit an updated sitemap.
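You can also sanity-check rules locally with Python's standard-library parser. Treat it as a rough check rather than a Google emulator: urllib.robotparser does not implement the * and $ wildcard extensions, and it applies Allow/Disallow rules in file order rather than by longest match.

```python
from urllib import robotparser

# Parse a robots.txt body directly, without fetching it over the network
rules = """\
User-agent: *
Disallow: /admin
Disallow: /private
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# Prefix matching: /admin also covers /admin/settings/users
print(rp.can_fetch("*", "https://example.com/admin/settings/users"))  # False
print(rp.can_fetch("*", "https://example.com/blog/hello"))            # True
```

For a production crawler you would call rp.set_url(...) and rp.read() to fetch the live file instead of parsing a string.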
Looking for more developer tools? Explore all Dev & Tech tools on EvvyTools.