What is a sitemap and three ways to create a sitemap.xml file

Before getting into the article itself, it helps to clarify what the term “site map” means, because it is used in two different senses.

The first is a list of all the pages of a site, usually placed near the footer or on a separate page. This kind of map makes the site easier for users to navigate and is part of its internal navigation.

The second is a special file that also contains information about the structure of your site, but is intended for search engine robots.

This second kind of site map is the subject of this article.

What is a sitemap?

A sitemap (or Sitemap file) is a file that tells search engines which pages and files on a site you consider important and provides useful information about them: for example, when a page was last updated, whether it contains images or videos, and which alternative language versions of the page exist.

Main types of Sitemap file

There are three main sitemap file formats:

  1. A sitemap in XML format is the most universal and popular option. In addition to complete information about each URL, it can be extended with additional data, and most content management systems (CMS) generate it automatically. For large sites, however, generating these files can be difficult, and a site without a CMS needs the file created from scratch.
  2. RSS, mRSS and Atom 1.0 feeds are similar in structure to XML sitemaps but carry less additional data about your files. Their main advantage is that most CMS generate them automatically.
  3. A text sitemap is an ordinary file with the TXT extension that can contain only URLs of HTML pages. It has to be created manually, and it tells Google only about text content that needs to be indexed.

Each of these formats has its own advantages and disadvantages, but it makes no difference to Google which one you use on your site. So choose the option that best suits your site and offers the flexibility you need.

Do all sites have a sitemap?

A site does not have a sitemap file by default, but if it runs on a content management system, you can generate a simple one with a plugin. For example, the Yoast plugin for WordPress generates an index file that links to sitemaps for every page type available on the site.

If a new site has no CMS, you should add the sitemap file yourself or with the help of technical specialists.

How to check if a site has a sitemap

By default, the Sitemap file is located in the root directory of your site, so the easiest check is to enter “https://yourdomain.com/sitemap.xml” in the browser’s address bar.

For example, let’s take the samsung.com site:

[Image: the sitemap of the samsung.com site]

The file is not always called sitemap.xml; the name could be sitemap_index.xml, for example. In such cases a redirect to the sitemap or index file is usually configured. If no redirect happens and you see an error page, then either the redirect is not configured and you need to know the correct file name, or the site has no sitemap.

Another way to find out whether a site has a Sitemap is to look for a link to it in the robots.txt file.

To do this, enter “https://yourdomain.com/robots.txt” in the address bar and check whether it contains a Sitemap directive:

[Image: a Sitemap directive in a robots.txt file]

The robots.txt file can contain links to several sitemaps at once.
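
The robots.txt check can also be scripted. The following is a minimal Python sketch, not a complete robots.txt parser: the function name find_sitemap_urls and the sample file content are made up for illustration, and the text is passed in as a string rather than downloaded.

```python
def find_sitemap_urls(robots_txt):
    """Return the URLs listed in Sitemap directives of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split only on the first colon so the URL itself stays intact.
        name, sep, value = line.partition(":")
        # The directive name is matched case-insensitively.
        if sep and name.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt content, used only for illustration.
example_robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://yourdomain.com/sitemap.xml
sitemap: https://yourdomain.com/sitemap_images.xml
"""

print(find_sitemap_urls(example_robots_txt))
```

To check a live site, the text of robots.txt would first have to be downloaded, for example with urllib.request.urlopen from the standard library.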

Is a sitemap important for SEO?

The sitemap.xml file itself is not a Google ranking factor; that is, it has no direct impact on how your site ranks in search. Even without a sitemap, your site can be indexed and search engines can crawl your content.

But with a Sitemap you can draw Google’s attention to the highest-priority pages or to entire sections of your site that should appear in the search results first. This lets you avoid wasting your crawl budget and speeds up the indexing of your content. For SEO, this means saving time and resources when attracting users from organic search. The best option for search engine promotion is an XML sitemap file. It lets you provide the most information about your pages and add information about important images or videos, which also affect how the page ranks in search.

When sitemap.xml is needed

If you have a large site. The crawl budget is limited, so it simply may not be enough for all the pages of your site if Googlebot crawls them indiscriminately. Also, if a large amount of content is constantly added to the site, there is no guarantee that every new page is linked from somewhere, and search engines may simply not notice it.

If you have a new site. Newly created sites usually have no external links, so search engine crawlers have trouble reaching their pages, which significantly slows down indexing.

If the site has a lot of multimedia content. If your site contains many videos and images, you can use sitemap.xml to give Google information about them. This tells the search engine that the content is important and gives it a better chance of appearing in search results.

If you have a news site. Indexing speed for new pages is critical for a news site. If your site is featured in Google News, the sitemap provides information about page updates and helps current news appear in search faster.

What is included in sitemap.xml

The sitemap.xml file is built from the following tags:

Mandatory:

<urlset> – encapsulates the file and refers to the protocol standard;

<url> – parent tag for each URL record;

<loc> – URL of the page. The value is limited to 2048 characters.

Optional:

<lastmod> – the date of the last page update;

<changefreq> – how often the page is expected to change;

<priority> – the priority for crawling the page relative to other URLs on your site. Must be in the range from 0.0 to 1.0.

Google’s documentation states that crawlers currently ignore the <priority> and <changefreq> tags, and that the <lastmod> tag is taken into account only if the date it contains matches the actual time the page was changed.

An example of the syntax of a sitemap.xml file containing one page:

[Image: example of a sitemap.xml file containing one page]
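
As a rough illustration of how the mandatory and optional tags fit together, here is a small Python sketch that fills in a one-page sitemap template. The function name, the URL, the date and the default changefreq and priority values are placeholders chosen for this example, not values from the article.

```python
from xml.sax.saxutils import escape

# One <url> record inside <urlset>, with the three optional tags included.
SITEMAP_TEMPLATE = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>{loc}</loc>
    <lastmod>{lastmod}</lastmod>
    <changefreq>{changefreq}</changefreq>
    <priority>{priority}</priority>
  </url>
</urlset>
"""


def single_page_sitemap(loc, lastmod, changefreq="monthly", priority=0.8):
    """Fill the template for one page; special characters in the URL are escaped."""
    return SITEMAP_TEMPLATE.format(
        loc=escape(loc),
        lastmod=lastmod,
        changefreq=changefreq,
        priority=f"{priority:.1f}",
    )


# Placeholder URL and date, chosen only for illustration.
sitemap_xml = single_page_sitemap("https://yourdomain.com/category/", "2024-01-15")
print(sitemap_xml)
```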

Basic requirements for the sitemap.xml file

  • the XML file must conform to the Sitemap protocol;
  • special characters in tag values must be escaped;
  • the file must be in UTF-8 encoding;
  • the file must be located at the root of the site;
  • the file size is limited to 50 MB, and the number of URLs must not exceed 50,000;
  • only absolute URLs (https://yourdomain.com/category/) should be specified, not relative ones (/category/);
  • only canonical URLs should be specified in the file.
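
Some of these requirements are easy to check automatically. The sketch below, in Python, covers only three of them (URL count, file size and absolute URLs); the function name check_sitemap and the sample input are assumptions made for illustration, and it does not validate the file against the Sitemap protocol itself.

```python
from urllib.parse import urlsplit

MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, measured on the UTF-8 encoded text


def check_sitemap(urls, xml_text):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if len(urls) > MAX_URLS:
        problems.append(f"too many URLs: {len(urls)}")
    if len(xml_text.encode("utf-8")) > MAX_BYTES:
        problems.append("file is larger than 50 MB")
    for url in urls:
        parts = urlsplit(url)
        # An absolute URL needs both a scheme and a host name.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {url}")
    return problems


# Placeholder input: one valid URL and one relative URL.
print(check_sitemap(["https://yourdomain.com/category/", "/category/"], "<urlset/>"))
```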

Site map index

If there are too many pages for one file to stay within these limits, or you want to single out specific sections and files that are a priority for you, separate Sitemap files are created instead of one. In other words, the file is split into smaller parts. In this case, sitemap.xml becomes a sitemap index file that contains links to all the individual sitemaps.

As an example, suppose we have a catalog of automotive products with two sections, tires and wheels, and we want to separate them. We create a sitemap_index.xml file for all pages outside the catalog (home page, delivery, etc.), put the pages from the tire section into sitemap_pokrishki.xml and the pages from the wheel section into sitemap_diski.xml, and add these files to the root folder. Then we create a file named sitemap.xml with the following content:

[Image: Sitemap index]
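
A sitemap index uses a <sitemapindex> root with one <sitemap><loc> record per file. The Python sketch below builds such an index for the three files from the example; the yourdomain.com host is a placeholder, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that links to the given sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    ET.indent(root)  # pretty-printing, available in Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


# File names from the example above; the host is a placeholder.
index_xml = build_sitemap_index([
    "https://yourdomain.com/sitemap_index.xml",
    "https://yourdomain.com/sitemap_pokrishki.xml",
    "https://yourdomain.com/sitemap_diski.xml",
])
print(index_xml)
```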

Sitemap for images

For images, you can add the appropriate tags to an existing sitemap or create a separate one. This is useful when an image, for example a photo of your product, needs to get into search quickly and not just the page. It also helps when search engines cannot reach images directly because the content is loaded via JavaScript or other technical means.

Example sitemap syntax for images:

[Image: Sitemap for images]
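
In Google's image extension, each <url> record can carry one or more <image:image> blocks with an <image:loc> inside. The following Python sketch generates such a file; the function name, page URL and image URLs are placeholders invented for this example.

```python
from xml.sax.saxutils import escape


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"',
        '        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">',
    ]
    for page_url, image_urls in pages.items():
        lines.append("  <url>")
        lines.append(f"    <loc>{escape(page_url)}</loc>")
        for image_url in image_urls:
            lines.append("    <image:image>")
            lines.append(f"      <image:loc>{escape(image_url)}</image:loc>")
            lines.append("    </image:image>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


# Placeholder page with two product photos.
image_sitemap = build_image_sitemap({
    "https://yourdomain.com/tires/": [
        "https://yourdomain.com/img/tire-1.jpg",
        "https://yourdomain.com/img/tire-2.jpg",
    ],
})
print(image_sitemap)
```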

Video sitemap

The principle is the same as for the image sitemap. A video sitemap has more tags but also more restrictions: for example, the video must be accessible without logging in to an account and must not be blocked from crawling in the robots.txt file.

An example of the syntax of a sitemap for a video:

[Image: Sitemap for videos]
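
A video record in Google's video extension needs at least a thumbnail, a title, a description and a link to the video file or player. The Python sketch below shows one possible way to generate it, using <video:content_loc> for the file link; the function name and all URLs and texts are placeholders, and the optional video tags are left out.

```python
from xml.sax.saxutils import escape

# One page with one video; only the core tags are included.
VIDEO_ENTRY = """\
  <url>
    <loc>{page}</loc>
    <video:video>
      <video:thumbnail_loc>{thumbnail}</video:thumbnail_loc>
      <video:title>{title}</video:title>
      <video:description>{description}</video:description>
      <video:content_loc>{content}</video:content_loc>
    </video:video>
  </url>"""


def build_video_sitemap(videos):
    """videos is a list of dicts with the keys used in VIDEO_ENTRY."""
    entries = "\n".join(
        VIDEO_ENTRY.format(**{key: escape(value) for key, value in video.items()})
        for video in videos
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
        '        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


# Placeholder data, chosen only for illustration.
video_sitemap = build_video_sitemap([{
    "page": "https://yourdomain.com/videos/fitting/",
    "thumbnail": "https://yourdomain.com/img/fitting.jpg",
    "title": "Fitting tires & wheels",
    "description": "A short guide to fitting tires.",
    "content": "https://yourdomain.com/video/fitting.mp4",
}])
print(video_sitemap)
```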

Site map for news

The sitemap.xml file can also be extended with special tags for news articles so that they are discovered faster. However, to better track the statistics of your Google News content in Google Search Console, it is recommended to create separate sitemap files for this type of content.

Syntax example for news:

[Image: Sitemap for news]
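
In Google's news extension, each article gets a <news:news> block with the publication name and language, the publication date and the title. The Python sketch below generates such a record; the function name, publication name, URL, date and title are placeholders made up for this example.

```python
from xml.sax.saxutils import escape

# One news article per <url> record.
NEWS_ENTRY = """\
  <url>
    <loc>{loc}</loc>
    <news:news>
      <news:publication>
        <news:name>{publication}</news:name>
        <news:language>{language}</news:language>
      </news:publication>
      <news:publication_date>{published}</news:publication_date>
      <news:title>{title}</news:title>
    </news:news>
  </url>"""


def build_news_sitemap(articles):
    """articles is a list of dicts with the keys used in NEWS_ENTRY."""
    entries = "\n".join(
        NEWS_ENTRY.format(**{key: escape(value) for key, value in article.items()})
        for article in articles
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
        '        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


# Placeholder article, chosen only for illustration.
news_sitemap = build_news_sitemap([{
    "loc": "https://yourdomain.com/news/new-tire-range/",
    "publication": "Example News",
    "language": "en",
    "published": "2024-01-15",
    "title": "New tire range announced",
}])
print(news_sitemap)
```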

Data about localized versions of pages

Another advantage of a Sitemap in XML format is the ability to list all the language and country versions of a page, using the hreflang attribute of the <xhtml:link rel="alternate"> child element.

Syntax example for a Sitemap containing information about foreign language versions of pages:

[Image: Sitemap with language versions of pages]
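
With hreflang annotations, every language version gets its own <url> record, and each record lists all versions, including itself. The Python sketch below illustrates this for two versions; the function name, language codes and URLs are placeholders chosen for the example.

```python
from xml.sax.saxutils import escape, quoteattr


def build_hreflang_sitemap(versions):
    """versions maps an hreflang code to the URL of that language version."""
    # Every <url> record repeats the full set of alternate links.
    links = "\n".join(
        f'    <xhtml:link rel="alternate" hreflang={quoteattr(lang)} href={quoteattr(url)}/>'
        for lang, url in versions.items()
    )
    entries = "\n".join(
        f"  <url>\n    <loc>{escape(url)}</loc>\n{links}\n  </url>"
        for url in versions.values()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
        '        xmlns:xhtml="http://www.w3.org/1999/xhtml">\n'
        f"{entries}\n"
        "</urlset>\n"
    )


# Placeholder language versions of one page.
hreflang_sitemap = build_hreflang_sitemap({
    "en": "https://yourdomain.com/en/",
    "uk": "https://yourdomain.com/uk/",
})
print(hreflang_sitemap)
```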

3 main ways to create a sitemap.xml file for your site

Method 1: Independent creation of a site map

This method is suitable only if the number of pages on the site will not change. Otherwise, new pages may not get into the index until you add them to sitemap.xml by hand, every time. For that reason we do not use this method in most cases, and it is entirely unsuitable for large online stores.

To make a simple sitemap by hand, we need two things:

  • a list of all site pages;
  • a site map template prepared in Google Sheets (download).

A list of site pages can be obtained with a site crawler. Several are available, for example:

  • Netpeak Spider (paid, trial period);
  • Xenu’s Link Sleuth (free);
  • WebSite Auditor (paid, there is a free version for a limited number of addresses);
  • Screaming Frog (paid), etc.

After receiving the list of page addresses, we open the Sitemap template or download it as an Excel table.

Next, we copy the list of page addresses into column B. We select all the filled cells in column A and extend them down to the last row of the list by dragging the small black square in the lower-right corner of the selection.

We do the same with columns C and D.

The result should look like this:

[Image: the completed template]

Using Notepad or similar software, we create an empty text file called sitemap, change its extension to “.xml” and open it.

Insert the following lines at the beginning of the file:

<?xml version="1.0" encoding="UTF-8"?>

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

Next, we copy the whole of column D from the template Excel file and paste it into our sitemap file.

Next, we insert the last line:

</urlset>

Then we save the resulting file and upload it to the root of the site, where the robots.txt file is located.

You can do this with WinSCP or Total Commander by simply dragging the file into the window with the root folder open.
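
The spreadsheet and Notepad steps can also be replaced by a short script. The Python sketch below is one possible shortcut, assuming the crawler exported the page addresses as plain text with one URL per line; the function name urls_to_sitemap and the sample list are made up for illustration.

```python
from xml.sax.saxutils import escape


def urls_to_sitemap(url_list_text):
    """Turn a text list of URLs (one per line) into the text of sitemap.xml."""
    # Keep non-empty lines and drop duplicates while preserving order.
    urls = list(dict.fromkeys(
        line.strip() for line in url_list_text.splitlines() if line.strip()
    ))
    rows = [f"<url><loc>{escape(url)}</loc></url>" for url in urls]
    return "\n".join([
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        *rows,
        "</urlset>",
    ]) + "\n"


# Hypothetical crawler export with a blank line and a duplicate.
crawl_export = """\
https://yourdomain.com/
https://yourdomain.com/delivery/

https://yourdomain.com/
"""

manual_sitemap = urls_to_sitemap(crawl_export)
print(manual_sitemap)
```

The returned text can then be saved as sitemap.xml in UTF-8, for example with pathlib.Path("sitemap.xml").write_text(manual_sitemap, encoding="utf-8"), and uploaded to the site root as described above.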

Method 2: Automatic generation using services and programs

This method differs from the first only in that you do not need to collect the site addresses and build the sitemap file yourself: the program does it for you. Its disadvantages are:

  • it will not work well if robots.txt is not properly formed;
  • you have to keep regenerating the site map manually, or add the URLs of pages created since the file was last generated;
  • if the number of pages on the site exceeds the free limit, the service may charge a fee.

This method is suitable for small sites whose structure does not change for a long time.

One of the most popular services for creating a site map is MySitemapGenerator. It lets you generate a file for up to 500 pages for free, with page priorities and last-update dates.

Go to the site and choose a tariff plan.

Next, insert the address of our site and click “Start”.

Wait while the crawler analyzes the site’s pages, then download the finished file.

When saving, name the file “sitemap.xml”.

After that, we copy the file to the root folder of the site (as described in Method 1).

There are many such programs, and which one to use is up to you. They all work in a similar way, so there is no point in describing each one separately.

In addition to online sitemap generators, there are also desktop crawlers. For example, Screaming Frog has a sitemap generator feature, but it works on the same principle as the online services.

Method 3: Auto-generated site map

This is the best option, in our opinion. Such site maps are created either by the built-in functionality of the admin panel or by programmers, using PHP scripts that generate the map “on the fly”. Common CMS such as WordPress and OpenCart either have this functionality built in or support plugins for it, and when properly configured they generate quite good maps.

But it is better not to do this work yourself and instead to write a technical specification for a programmer. It should have approximately the following content:

  1. Create an auto-generated site map written to the sitemap.xml file, which will be located in the root folder.
  2. Include only canonical pages and files that are not blocked from indexing by the meta robots “noindex” tag or the robots.txt file.
  3. Configure the “lastmod” tag, which indicates the date the page was last modified. The modification date is taken from the content management system (CMS).
  4. Set up sitemap regeneration whenever pages are added or deleted, but at least once a month.

You can also configure the sitemap file to be generated on every request for it. But if the number of pages is large, this can load the hosting so heavily that the site becomes unavailable for a time.
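
To make points 2 and 3 of such a specification more concrete, here is a minimal Python sketch of the selection logic. The Page record and its fields are hypothetical: a real CMS stores this information in its own way, and the article itself mentions PHP for production generators.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Page:
    """Hypothetical page record; the field names are assumptions for this sketch."""
    url: str
    canonical_url: str
    noindex: bool
    blocked_by_robots: bool
    last_modified: date


def sitemap_entries(pages):
    """Return (url, lastmod) pairs for the pages that belong in the sitemap."""
    entries = []
    for page in pages:
        if page.url != page.canonical_url:
            continue  # not the canonical version of the page
        if page.noindex or page.blocked_by_robots:
            continue  # blocked from indexing
        # lastmod comes from the modification date stored in the CMS.
        entries.append((page.url, page.last_modified.isoformat()))
    return entries


# Placeholder pages: one indexable, one duplicate, one marked noindex.
pages = [
    Page("https://yourdomain.com/tires/", "https://yourdomain.com/tires/",
         False, False, date(2024, 1, 15)),
    Page("https://yourdomain.com/tires/?sort=price", "https://yourdomain.com/tires/",
         False, False, date(2024, 1, 15)),
    Page("https://yourdomain.com/cart/", "https://yourdomain.com/cart/",
         True, False, date(2024, 1, 10)),
]
print(sitemap_entries(pages))
```

Running such a function from a scheduled task, or whenever a page is added or deleted, avoids rebuilding the file on every request.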

How to send a sitemap.xml file to Google

In order for the search engine to get data about your sitemap faster, you can perform two main actions:

Adding a Sitemap to your robots.txt file

At the very end of the robots.txt file, which should be located at the root of the site, add a directive with a link to your sitemap, or several directives if you have several Sitemap files.

[Image: Sitemap directives in a robots.txt file]
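
Each directive has the form "Sitemap: " followed by the absolute URL of the file. The Python sketch below appends such lines to the text of an existing robots.txt and skips URLs that are already listed; the function name and the sample content are assumptions made for illustration.

```python
def add_sitemap_directives(robots_txt, sitemap_urls):
    """Append Sitemap directives to robots.txt text, skipping ones already present."""
    lines = robots_txt.rstrip("\n").splitlines()
    existing = {
        line.split(":", 1)[1].strip()
        for line in lines
        if line.lower().startswith("sitemap:")
    }
    new_lines = [f"Sitemap: {url}" for url in sitemap_urls if url not in existing]
    if new_lines:
        if lines:
            lines.append("")  # blank line before the directives
        lines.extend(new_lines)
    return "\n".join(lines) + "\n"


# Hypothetical robots.txt content and sitemap URLs.
updated_robots = add_sitemap_directives(
    "User-agent: *\nDisallow: /admin/\n",
    ["https://yourdomain.com/sitemap.xml", "https://yourdomain.com/sitemap_images.xml"],
)
print(updated_robots)
```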

Adding to Google Search Console

If you have verified your site in Google Search Console, you can submit sitemaps and/or sitemap index files for processing by the search engine.

To do this, go to the “Sitemaps” tab.

Then add the URLs of all your sitemaps for processing.

If the file has no problems and passes validation, you will see the status “Success” and the number of pages the file contains.

To summarize

If you plan to carry out high-quality SEO for your site to achieve a good search ranking, we recommend setting up automatic generation of Sitemap files in XML format. That way the search engine will always be aware of all updates and changes to the number of pages and their content. This is especially relevant for search engines that update their search results incrementally.

Serhii Ivanchenko
CEO
Pavlo Vlasiuk
CMO