About the Sitemap Generator
The XML Sitemap Generator takes a pasted URL list and produces a production-ready sitemap.xml that validates against the sitemaps.org schema. It handles URL trimming, duplicate removal, scheme validation, and the protocol limits (50,000 URLs / 50 MB per file). Above those limits it automatically splits the output into a sitemap index that references child sitemaps. Image-sitemap extensions and hreflang multi-language declarations are supported.
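The clean-up steps named above (trimming, duplicate removal, scheme validation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: clean_urls is a hypothetical helper name, the input is assumed to be one URL per line, and duplicates are matched by exact string comparison.

```python
from urllib.parse import urlsplit


def clean_urls(raw_text):
    """Trim, validate, and de-duplicate a pasted URL list (one URL per line)."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        url = line.strip()  # URL trimming
        if not url:
            continue  # skip blank lines
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # malformed URL, e.g. a broken IPv6 host
        # Scheme validation: keep only absolute http(s) URLs with a host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        if url in seen:
            continue  # duplicate removal (exact match, an assumption)
        seen.add(url)
        cleaned.append(url)
    return cleaned
```

Input order is preserved, so the first occurrence of each URL wins.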
It is built for SEO managers feeding Google Search Console a clean sitemap after a site redesign, developers wiring up a sitemap.xml on a new launch, ecommerce sites with thousands of product URLs that need the index-file split, multilingual sites that need hreflang declarations Google will actually respect, and anyone whose CMS-generated sitemap somehow includes 404 URLs.
All generation runs locally in your browser. Your URL list, the generated XML, and any custom priority / changefreq values never leave your device, and the page makes no network call after first load. Pre-launch site structures are competitively sensitive (especially for new product launches and redesigns), so they stay on your machine and are never sent to a server.
Two things to remember. First, Google has ignored priority and changefreq values for years, and they do not influence ranking; their value is internal weighting for your own crawl planning. Second, include only canonical, indexable, status-200 URLs: sitemaps full of redirects, 404s, or noindex pages train Google to distrust the file. Pair the sitemap with a robots.txt that references it, submit it in Search Console, and re-submit after structural changes.
What Is an XML Sitemap and Why Does Your Site Need One?
An XML sitemap is a structured file that lists every URL on your website you want search engines to discover and index. It follows a strict XML schema defined by the sitemaps.org protocol (supported by Google, Bing, Yahoo, and Yandex). While search engines can discover pages through links, a sitemap acts as a direct road map — ensuring new pages, deep content, and orphaned URLs don’t slip through the cracks during crawling. Sites with more than a few dozen pages, frequently updated content, rich media, or poor internal linking benefit most. Google’s own documentation states that sitemaps are “especially useful” for large sites, sites with isolated pages, and new domains with few inbound links. Without a sitemap, Google relies entirely on following links, which means newly published pages can take days or weeks to appear in search results.
XML Sitemap Structure: Tags, Fields, and Limits
A standard XML sitemap uses the <urlset> root element with the namespace
http://www.sitemaps.org/schemas/sitemap/0.9. Each URL entry is wrapped in a
<url> element containing up to four child tags: <loc>
(the absolute URL — required), <lastmod> (the date the page was
last modified in W3C Datetime format), <changefreq> (a hint about how
often the content changes), and <priority> (a value from 0.0 to 1.0
indicating the relative importance within your site). The protocol imposes two hard limits:
a single sitemap file cannot contain more than 50,000 URLs and cannot exceed
50 MB uncompressed. If your site exceeds either limit, you must split the
sitemap into multiple files and reference them from a sitemap index file.
This tool handles that automatically — when your URL count crosses the 50,000 threshold,
it generates a sitemap index with properly segmented child sitemaps.
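To make the structure above concrete, here is a minimal Python sketch that serializes a handful of entries into a <urlset> document with the standard library. build_urlset and the shape of the entries list are assumptions made for illustration; the tool's own implementation may differ.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Serialize entries into sitemap XML (bytes).

    Each entry is a dict with a required 'loc' and optional
    'lastmod', 'changefreq', and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]  # required
        for tag in ("lastmod", "changefreq", "priority"):  # optional
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # xml_declaration=True prepends the <?xml ...?> line (Python 3.8+).
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

ElementTree escapes text content on output, so an ampersand in a loc value is written as an entity without any extra handling.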
Understanding changefreq and priority
The <changefreq> tag tells crawlers how frequently a page is likely to change.
Valid values range from always (pages that change with every access, like stock
tickers) through hourly, daily, weekly,
monthly, yearly, to never (archived URLs that
will not change again). This is a hint, not a directive: a crawler may visit a “yearly”
page daily if it detects frequent changes, or skip a “daily” page for weeks if the
content never actually updates.
The <priority> tag is often misunderstood. It is a relative signal
within your own site, not an absolute ranking factor. Setting every page to 1.0 is the same as
setting every page to 0.5 — the value only matters in comparison to other pages on your
domain. A sensible strategy: homepage at 1.0, key category or landing pages at 0.8, blog posts
and product pages at 0.5–0.7, and utility pages (privacy, terms) at 0.1–0.3.
Google has stated publicly that it largely ignores <priority>, but Bing and
Yandex may use it as a lightweight signal when deciding crawl order.
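The allowed values described above are easy to check before writing a file. The sketch below is one possible pre-flight check; validate_hints is a hypothetical helper name, not something the tool exposes.

```python
VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def validate_hints(changefreq=None, priority=None):
    """Return a list of problems with the optional hint fields (empty if none)."""
    problems = []
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append(f"invalid changefreq: {changefreq!r}")
    if priority is not None:
        try:
            in_range = 0.0 <= float(priority) <= 1.0
        except (TypeError, ValueError):
            in_range = False  # not a number at all
        if not in_range:
            problems.append(f"priority out of range: {priority!r}")
    return problems
```

An empty list means both hints are acceptable; omitting a hint is always valid because both tags are optional.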
When to Use a Sitemap Index File
A sitemap index is a wrapper file that lists the locations of multiple sitemap files. It uses
the <sitemapindex> root element and contains one or more
<sitemap> entries, each with a <loc> pointing to a child
sitemap and an optional <lastmod>. You need a sitemap index when your site
has more than 50,000 URLs, when you want to segment sitemaps by content type (products, blog,
pages, images), or when different sections of your site update at different frequencies. Large
e-commerce sites commonly have separate sitemaps for product pages, category pages, and blog
posts — this lets search engines prioritize recrawling the most volatile segments without
processing the entire index. The sitemap index itself can contain up to 50,000 sitemap
references, giving you a theoretical maximum of 2.5 billion URLs across all sub-sitemaps.
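The split-and-index step can be sketched as follows. This is an illustrative Python version under assumptions: chunk_urls and build_sitemap_index are hypothetical names, the child file URLs are made up for the example, and only the 50,000-URL limit is enforced (a real splitter would also watch the 50 MB size limit).

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a URL list into consecutive chunks of at most `size` URLs."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(child_locs):
    """Serialize a sitemap index (bytes) pointing at each child sitemap URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in child_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


# Example: 120,000 URLs need three child sitemaps.
urls = [f"https://example.com/p/{i}" for i in range(120_000)]
chunks = chunk_urls(urls)
# Hypothetical child file names; any stable URL scheme works.
child_locs = [
    f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(child_locs)
```

The 2.5 billion figure is simply 50,000 index entries multiplied by 50,000 URLs per child sitemap.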
Image Sitemaps and Rich Media Indexing
Standard sitemaps tell Google about HTML pages, but images loaded via JavaScript, lazy loading,
or CSS backgrounds may not be discoverable through normal crawling. An image sitemap extends the
standard format with the image namespace and <image:image> child
elements inside each <url> block. Each image entry requires an
<image:loc> (the absolute URL to the image file). Older guides also list
<image:caption>, <image:title>, <image:geo_location>, and <image:license>,
but Google's documentation now marks those tags as deprecated. Google supports
up to 1,000 image entries per page URL. This is especially important for photography sites,
e-commerce product galleries, and portfolio sites where images are the primary content. Including
images in your sitemap increases their chances of appearing in Google Images results, which can
drive significant traffic for visual-heavy niches.
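As a rough sketch of the format described above, the following Python builds an image sitemap with properly namespaced elements. build_image_sitemap and its input shape (a dict of page URL to image URLs) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
MAX_IMAGES_PER_URL = 1000


def build_image_sitemap(pages):
    """pages maps each page URL to a list of absolute image URLs."""
    # Map the namespaces to the prefixes the sitemap format expects.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        if len(image_urls) > MAX_IMAGES_PER_URL:
            raise ValueError(f"more than 1,000 images listed for {page_url}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
```

Note that register_namespace changes module-wide state in ElementTree, which is fine for a one-off script but worth knowing about in a larger program.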
Hreflang in Sitemaps for International SEO
If your site serves content in multiple languages or targets different regions, you can declare
hreflang relationships directly in your sitemap using the xhtml namespace. Each
<url> entry includes one or more
<xhtml:link rel="alternate" hreflang="[lang]" href="[url]"/>
elements that tell search engines which version of the page to show users in each language or
region. The hreflang value uses ISO 639-1 language codes optionally combined with ISO 3166-1
Alpha-2 region codes (for example, en-US for American English, fr for
French regardless of region, x-default for the fallback). A critical rule: every
page that declares alternates must include itself in the list, and the relationship
must be reciprocal — page A must point to page B, and page B must point back to page A.
Missing reciprocal links are the single most common hreflang implementation error and cause
Google to ignore the annotations entirely.
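The self-reference and reciprocity rules lend themselves to an automated check. The sketch below is one way to express them in Python; find_missing_hreflang_links and the nested-dict input format are assumptions chosen for the example, not a format the tool or Google defines.

```python
def find_missing_hreflang_links(alternates):
    """Check hreflang declarations for self-references and return links.

    alternates maps each page URL to a dict of {hreflang: alternate URL}.
    Returns a sorted list of (page_to_fix, missing_target, reason) tuples.
    """
    problems = []
    for page, links in alternates.items():
        targets = set(links.values())
        # Rule 1: every page must list itself among its alternates.
        if page not in targets:
            problems.append((page, page, "missing self-reference"))
        # Rule 2: every alternate must link back to this page.
        for target in targets:
            if target == page:
                continue
            back_links = set(alternates.get(target, {}).values())
            if page not in back_links:
                problems.append((target, page, "missing return link"))
    return sorted(problems)


# Example: the French page does not point back to the English page.
declared = {
    "https://example.com/en/": {
        "en": "https://example.com/en/",
        "fr": "https://example.com/fr/",
    },
    "https://example.com/fr/": {
        "fr": "https://example.com/fr/",
    },
}
issues = find_missing_hreflang_links(declared)
```

Each tuple names the page whose entry needs editing first, so the output can be read as a to-do list.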
Common XML Sitemap Mistakes to Avoid
The most frequent mistake is including URLs that return non-200 status codes. Every URL in your sitemap should return a 200 OK response — listing 301 redirects, 404 pages, or 500 errors wastes crawl budget and signals poor site maintenance. Other common issues include:
- Including noindexed pages: if a page has a noindex meta tag or HTTP header, it should not appear in the sitemap. Google will flag this as a conflict in Search Console.
- Protocol mismatches: mixing http:// and https:// URLs, or inconsistent www vs non-www usage, creates duplicate entries that confuse crawlers.
- Inaccurate lastmod dates: setting every page to today’s date defeats the purpose. Google uses lastmod to prioritize recrawling, so inaccurate dates reduce its trust in your sitemap data.
- Missing the XML declaration: while most parsers handle it gracefully, a sitemap should open with <?xml version="1.0" encoding="UTF-8"?> as the first line, because the protocol requires UTF-8 encoding.
- Unescaped special characters: ampersands in query strings must be escaped as &amp;, and angle brackets in URLs (rare but possible) need their respective XML entities (&lt; and &gt;).
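The escaping rule in the last item can be illustrated with Python's standard library. escape_loc is a hypothetical helper name; the sketch assumes the input URL has not been entity-escaped already, since running it twice would double-escape the ampersands.

```python
from xml.sax.saxutils import escape


def escape_loc(url):
    """Entity-escape a URL for use as the text of a <loc> element."""
    # escape() replaces &, <, and > by default; the extra mapping
    # also covers single and double quotes.
    return escape(url, {"'": "&apos;", '"': "&quot;"})


escaped = escape_loc("https://example.com/search?q=shoes&page=2")
# escaped == "https://example.com/search?q=shoes&amp;page=2"
```

XML libraries that build the document tree for you generally apply this escaping on output, so manual escaping is mainly needed when assembling the XML as strings.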
Frequently Asked Questions
What is the maximum size of an XML sitemap?
Per the sitemaps.org protocol, a single sitemap file can contain up to 50,000 URLs and must be no larger than 50 MB uncompressed. Larger sites should split into multiple sitemaps and reference them from a sitemap index file.
Does priority in a sitemap actually influence rankings?
No. Google ignores priority and changefreq values, so they have no effect on rankings. Their real value is relative weighting within your own site, which can inform your own crawl planning. Focus on including only canonical, indexable URLs.
How often should I update my sitemap?
Whenever you publish, delete, or significantly update pages. Many sites regenerate sitemaps automatically on each deploy. For highly dynamic sites, a nightly rebuild is sufficient because Google rechecks the file on its own crawl schedule.
What is hreflang in a sitemap?
An <xhtml:link rel="alternate" hreflang="[lang]" href="[url]"/> element inside a <url> entry declares a language or region variant of that page. It is a sitemap extension documented by Google and tells search engines which localized version to show each user based on their language and country.
Do I need a sitemap if my site has good internal linking?
For small sites with strong internal linking, probably not. For sites with more than a few hundred pages, orphaned content, or frequent publishing, a sitemap meaningfully accelerates indexing. Google's own documentation recommends sitemaps for large, new, or rich-media sites.