Paste your URLs, set change frequency and priority defaults, and get a valid XML sitemap ready for Google Search Console in seconds — complete with duplicate detection, syntax validation, and auto-splitting for large sites.
Pro tip: Don’t set every URL to priority 1.0.
When everything is highest priority, nothing is. Search engines that read priority treat it
as a relative signal within your own site. Reserve 1.0 for your homepage, 0.8 for key landing pages,
and let most content sit at 0.5.
What Is an XML Sitemap and Why Does Your Site Need One?
An XML sitemap is a structured file that lists every URL on your website you want search engines to discover and index. It follows a strict XML schema defined by the sitemaps.org protocol (supported by Google, Bing, Yahoo, and Yandex). While search engines can discover pages through links, a sitemap acts as a direct road map — ensuring new pages, deep content, and orphaned URLs don’t slip through the cracks during crawling. Sites with more than a few dozen pages, frequently updated content, rich media, or poor internal linking benefit most. Google’s own documentation states that sitemaps are “especially useful” for large sites, sites with isolated pages, and new domains with few inbound links. Without a sitemap, Google relies entirely on following links, which means newly published pages can take days or weeks to appear in search results.
XML Sitemap Structure: Tags, Fields, and Limits
A standard XML sitemap uses the <urlset> root element with the namespace
http://www.sitemaps.org/schemas/sitemap/0.9. Each URL entry is wrapped in a
<url> element containing up to four child tags: <loc>
(the absolute URL — required), <lastmod> (the date the page was
last modified in W3C Datetime format), <changefreq> (a hint about how
often the content changes), and <priority> (a value from 0.0 to 1.0
indicating the relative importance within your site). The protocol imposes two hard limits:
a single sitemap file cannot contain more than 50,000 URLs and cannot exceed
50 MB uncompressed. If your site exceeds either limit, you must split the
sitemap into multiple files and reference them from a sitemap index file.
This tool handles that automatically — when your URL count crosses the 50,000 threshold,
it generates a sitemap index with properly segmented child sitemaps.
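As a rough illustration of the structure described above, the following Python sketch builds a <urlset> document with the standard library. The function name build_urlset and the dictionary-based entry format are assumptions made for this example, not part of the tool or the protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # protocol limit for a single sitemap file


def build_urlset(entries):
    """Return sitemap XML (bytes) for a list of entry dicts.

    Each entry needs 'loc'; 'lastmod', 'changefreq' and 'priority'
    are optional. This entry format is a hypothetical convention.
    """
    if len(entries) > MAX_URLS_PER_FILE:
        raise ValueError("too many URLs for one file; split and use a sitemap index")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child tags in the conventional order, skipping absent ones.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # ElementTree escapes '&', '<' and '>' in text automatically.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


example = build_urlset([
    {"loc": "https://example.com/", "lastmod": "2024-01-15",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "https://example.com/search?q=a&page=2"},
])
```

Only <loc> is mandatory, so the second entry is still a valid <url> element even though it carries no optional tags.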
Understanding changefreq and priority
The <changefreq> tag tells crawlers how frequently a page is likely to change.
Valid values range from always (pages that change with every access, like stock
tickers) through hourly, daily, weekly,
monthly, yearly, to never (archived URLs that
will not change again). It is important to understand that this is a hint, not a
directive — Google may crawl a “yearly” page daily if it detects frequent
changes, or skip a “daily” page for weeks if the content never actually updates.
The <priority> tag is often misunderstood. It is a relative signal
within your own site, not an absolute ranking factor. Setting every page to 1.0 is the same as
setting every page to 0.5 — the value only matters in comparison to other pages on your
domain. A sensible strategy: homepage at 1.0, key category or landing pages at 0.8, blog posts
and product pages at 0.5–0.7, and utility pages (privacy, terms) at 0.1–0.3.
Google's documentation states that it ignores both <priority> and <changefreq>, but Bing and
Yandex may use them as lightweight signals when deciding crawl order.
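The rules above can be sketched as a small validation helper. The DEFAULTS table is a hypothetical mapping: its priority values follow the strategy described in this section, while the page-type names and the changefreq value chosen for each type are assumptions for illustration only.

```python
# The seven changefreq values allowed by the sitemaps.org protocol.
VALID_CHANGEFREQ = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)

# Hypothetical per-page-type defaults (changefreq, priority).
DEFAULTS = {
    "home": ("daily", 1.0),
    "landing": ("weekly", 0.8),
    "post": ("monthly", 0.5),
    "utility": ("yearly", 0.2),
}


def validate_hints(changefreq, priority):
    """Check both hints and return them formatted for a sitemap entry."""
    if changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"invalid changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0, got {priority}")
    # One decimal place is the conventional precision for priority.
    return changefreq, f"{priority:.1f}"


def hints_for(page_type):
    """Look up defaults for a page type, falling back to 'post'."""
    changefreq, priority = DEFAULTS.get(page_type, DEFAULTS["post"])
    return validate_hints(changefreq, priority)
```

Because both values are only hints, a helper like this is mainly useful for catching typos such as "biweekly" before the file is published.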
When to Use a Sitemap Index File
A sitemap index is a wrapper file that lists the locations of multiple sitemap files. It uses
the <sitemapindex> root element and contains one or more
<sitemap> entries, each with a <loc> pointing to a child
sitemap and an optional <lastmod>. You need a sitemap index when your site
has more than 50,000 URLs, when you want to segment sitemaps by content type (products, blog,
pages, images), or when different sections of your site update at different frequencies. Large
e-commerce sites commonly have separate sitemaps for product pages, category pages, and blog
posts — this lets search engines prioritize recrawling the most volatile segments without
processing the entire index. The sitemap index itself can contain up to 50,000 sitemap
references, giving you a theoretical maximum of 2.5 billion URLs across all sub-sitemaps.
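A minimal sketch of the splitting logic, assuming a flat list of URLs and a hypothetical child file naming scheme (sitemap-1.xml, sitemap-2.xml, and so on). It is not the tool's actual implementation; it only shows how the 50,000 limit leads to an index file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000
MAX_SITEMAPS_PER_INDEX = 50_000


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(child_locs):
    """Return sitemap index XML (bytes) pointing at each child sitemap."""
    if len(child_locs) > MAX_SITEMAPS_PER_INDEX:
        raise ValueError("a sitemap index may list at most 50,000 sitemaps")
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in child_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# 120,000 URLs need three child sitemaps: 50,000 + 50,000 + 20,000.
urls = [f"https://example.com/p/{i}" for i in range(120_000)]
parts = chunk(urls)
# Hypothetical naming scheme for the child files.
child_locs = [f"https://example.com/sitemap-{n}.xml"
              for n in range(1, len(parts) + 1)]
index_xml = build_index(child_locs)

# Theoretical ceiling: 50,000 sitemaps x 50,000 URLs each.
theoretical_max = MAX_SITEMAPS_PER_INDEX * MAX_URLS_PER_FILE  # 2,500,000,000
```

Segmenting by content type works the same way: build one URL list per segment, chunk each list, and add every resulting file to the index.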
Image Sitemaps and Rich Media Indexing
Standard sitemaps tell Google about HTML pages, but images loaded via JavaScript, lazy loading,
or CSS backgrounds may not be discoverable through normal crawling. An image sitemap extends the
standard format with the image namespace and <image:image> child
elements inside each <url> block. Each image entry requires an
<image:loc> (the absolute URL to the image file). Older guides also list
<image:caption>, <image:title>, <image:geo_location>, and <image:license>,
but Google deprecated those optional tags in 2022 and now reads only <image:loc>. Google supports
up to 1,000 image entries per page URL. This is especially important for photography sites,
e-commerce product galleries, and portfolio sites where images are the primary content. Including
images in your sitemap increases their chances of appearing in Google Images results, which can
drive significant traffic for visual-heavy niches.
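The sketch below shows one way to emit image entries with Python's standard library. The input shape (a dict mapping each page URL to a list of image URLs) and the function name are assumptions for this example; the image namespace URI is the one Google documents for image sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_URL = 1000

# Serialize the sitemap namespace as the default and the image
# namespace with the conventional "image" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to a list of absolute image URLs."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        if len(image_urls) > MAX_IMAGES_PER_URL:
            raise ValueError(f"more than 1,000 images listed for {page_url}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


gallery_xml = build_image_sitemap({
    "https://example.com/gallery": [
        "https://example.com/img/a.jpg",
        "https://example.com/img/b.jpg",
    ],
})
```

The sketch emits only <image:loc>, which is the one child element every image entry must have.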
Hreflang in Sitemaps for International SEO
If your site serves content in multiple languages or targets different regions, you can declare
hreflang relationships directly in your sitemap using the xhtml namespace. Each
<url> entry includes one or more
<xhtml:link rel="alternate" hreflang="[lang]" href="[url]"/>
elements that tell search engines which version of the page to show users in each language or
region. The hreflang value uses ISO 639-1 language codes optionally combined with ISO 3166-1
Alpha-2 region codes (for example, en-US for American English, fr for
French regardless of region, x-default for the fallback). A critical rule: every
page that declares alternates must include itself in the list, and the relationship
must be reciprocal — page A must point to page B, and page B must point back to page A.
Missing reciprocal links are the single most common hreflang implementation error and cause
Google to ignore the annotations entirely.
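To make the self-reference and reciprocity rules concrete, here is a hedged Python sketch. Both function names and the input shapes are hypothetical: build_hreflang_sitemap takes one cluster of equivalent pages (hreflang code mapped to URL), and missing_return_links takes a mapping from each page URL to the alternates it declares.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(cluster):
    """cluster maps an hreflang code to the URL of that language version."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # dict.fromkeys removes duplicate URLs (for example when x-default
    # reuses the en-US page) while keeping the original order.
    for page_url in dict.fromkeys(cluster.values()):
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # Every page lists the full cluster, itself included, so the
        # annotations are reciprocal by construction.
        for lang, href in cluster.items():
            ET.SubElement(url, f"{{{XHTML_NS}}}link",
                          rel="alternate", hreflang=lang, href=href)
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


def missing_return_links(annotations):
    """Return (source, target) pairs where target does not link back.

    annotations maps each page URL to its {hreflang: url} alternates.
    """
    missing = []
    for page, alternates in annotations.items():
        for target in alternates.values():
            if target == page:
                continue
            if page not in annotations.get(target, {}).values():
                missing.append((page, target))
    return missing
```

Generating every <url> entry from the same cluster avoids one-way links entirely; the checker is meant for auditing annotations that were written by hand.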
Common XML Sitemap Mistakes to Avoid
The most frequent mistake is including URLs that return non-200 status codes. Every URL in your sitemap should return a 200 OK response — listing 301 redirects, 404 pages, or 500 errors wastes crawl budget and signals poor site maintenance. Other common issues include:
- Including noindexed pages: if a page has a noindex meta tag or HTTP header, it should not appear in the sitemap. Google will flag this as a conflict in Search Console.
- Protocol mismatches: mixing http:// and https:// URLs, or inconsistent www vs non-www usage, creates duplicate entries that confuse crawlers.
- Stale lastmod dates: setting every page to today’s date defeats the purpose. Google uses lastmod to prioritize recrawling, so inaccurate dates reduce its trust in your sitemap data.
- Missing the XML declaration: while most parsers handle it gracefully, the sitemap protocol requires <?xml version="1.0" encoding="UTF-8"?> as the first line.
- Unescaped special characters: ampersands in query strings must be escaped as &amp;, and angle brackets in URLs (rare but possible) need their respective XML entities (&lt; and &gt;).
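Some of these mistakes can be caught offline before the sitemap is published. The following sketch checks a URL list for mixed protocols, mixed www and non-www hosts, and duplicates, and escapes special characters for use inside <loc>. The function names are hypothetical, and the sketch deliberately leaves out status-code and noindex checks because those need live HTTP requests.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def _strip_www(host):
    """Drop a leading 'www.' so www and bare hosts compare equal."""
    return host[4:] if host.startswith("www.") else host


def lint_urls(urls):
    """Return a list of human-readable problems found in a URL list."""
    problems = []
    schemes = {urlsplit(u).scheme.lower() for u in urls}
    hosts = {urlsplit(u).netloc.lower() for u in urls}
    if len(schemes) > 1:
        problems.append("mixed protocols: " + ", ".join(sorted(schemes)))
    # If stripping 'www.' merges two hosts, both variants are present.
    if len({_strip_www(h) for h in hosts}) < len(hosts):
        problems.append("mixed www and non-www hosts")
    seen, duplicates = set(), set()
    for u in urls:
        if u in seen:
            duplicates.add(u)
        seen.add(u)
    if duplicates:
        problems.append(f"{len(duplicates)} duplicate URL(s)")
    return problems


def escape_loc(url):
    """Escape '&', '<' and '>' so the URL is safe inside a <loc> element."""
    return escape(url)
```

XML libraries that build the document as a tree usually apply this escaping on their own; escape_loc matters when the sitemap is assembled by string concatenation or templates.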