To begin with, check whether your site or its individual pages are present in the Google index at all. Often there is no indexing problem: the page is in the index but simply does not rank high enough in the search results.
An effective way to check:
- Turn off search personalization or open an incognito (private) browser tab
- Search for the page you are interested in with the site:your-domain operator (search operators are covered in more detail in a separate article)
Examples: site:example.com or site:example.com/zootovary
- If you can find the required page, it is in the index
Why a site may be missing from the index and how to fix it
There are many reasons why Google may not index a site, so each case needs to be studied individually to find the source of the problem as soon as possible. The most common causes are described below.
The site is closed from indexing in robots.txt
One of the most common webmaster mistakes is an incorrectly written robots.txt. Blocking Disallow directives in it can prevent the bot from crawling the entire resource or individual pages. At the same time, remember that every website has pages that do need to be closed from crawling: internal search results, pages with GET parameters, login and admin pages, shopping carts, other “trash” pages, etc. Read more about configuring the file in our article «How to make a robots.txt file?»
Ways to check that robots.txt is set up correctly:
- Add /robots.txt to your domain name (for example, https://example.com/robots.txt) and review the directives yourself
- Use Google's robots.txt checking tool; to do so, you first need to add the site to Google Search Console
- Crawl the site with Screaming Frog SEO Spider or Netpeak Spider. These crawlers show which pages are closed from crawling, so you can fix errors in robots.txt and then verify that the corrected file works as intended
- Or find any other online robots.txt validator that is convenient for you; a simple programmatic check is also sketched below
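As an extra, purely illustrative option, Python's standard library can parse robots.txt and report whether a given URL is blocked for a given user agent. This is a minimal sketch; the domain and path are placeholders for your own site.

# Check whether robots.txt blocks Googlebot from crawling a specific URL
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://example.com/robots.txt")
parser.read()  # download and parse the file

url = "https://example.com/zootovary"
if parser.can_fetch("Googlebot", url):
    print("robots.txt allows Googlebot to crawl", url)
else:
    print("robots.txt blocks Googlebot from crawling", url)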
Robots meta tags
Sometimes pages are closed with a meta tag placed in the page code. It looks like this: <meta name="robots" content="noindex"> and tells search engines not to add the page to the index. There are many robots directives, but let's focus on this one. When this tag is placed between the <head></head> tags, Google can still crawl the page but will not add it to the index. So if you notice that some pages are missing from the index, this is one of the first things to check.
How to check robots meta tags:
- In Google Search Console (the tool for webmasters), open Index – Coverage and look at the Excluded tab
- Crawl the site with Screaming Frog SEO Spider or Netpeak Spider
- Open the page code manually (F12), press Ctrl+F and search for meta name="robots"; a small script for the same check is sketched below
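For bulk checks, a short script can fetch a page and look for a blocking robots meta tag. This is a minimal sketch that assumes the requests library is installed; the URL is a placeholder, and a production check would need to handle more HTML variations.

# Fetch a page and look for a robots meta tag containing "noindex"
import re
import requests

url = "https://example.com/some-page/"
html = requests.get(url, timeout=10).text

meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.IGNORECASE)
if meta and "noindex" in meta.group(0).lower():
    print("The page is closed from indexing:", meta.group(0))
else:
    print("No blocking robots meta tag found")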
Prohibitions in the .htaccess file
The .htaccess file contains rules for the web server (most often Apache) and is usually placed in the root directory of the server or of the site. Some .htaccess rules can close the site from indexing, for example by denying access to certain paths or by sending a blocking header. Check the .htaccess file on your server to see whether it contains any such rules; an illustrative example is shown below.
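Because these rules are not visible from the outside until you open the file, here is a hedged illustration of what blocking rules might look like, assuming an Apache server with the mod_headers module enabled; your own file, if it blocks anything at all, may use different directives.

# Send a noindex header for all PDF files (requires mod_headers)
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex"
</FilesMatch>

# Deny all access, so crawlers receive a 403 and pages drop out of the index
# Require all denied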
Incorrectly configured or missing rel="canonical" tag
The rel="canonical" tag is used on pages with identical content. It tells search engines which URL is the main version of the page. Consider, for example, two pages with the same content:
- https://mysupersite.com/original-page/ – the main page;
- https://mysupersite.com/dublicat-page/ – a page with duplicate content.
To keep the main page in the index, apply the rel="canonical" tag: in the HTML code of the page https://mysupersite.com/dublicat-page/, between the <head></head> tags, add the following tag:
<link rel="canonical" href="https://mysupersite.com/original-page/" />
If your pages are not indexed, check whether the rel="canonical" tag is present in the HTML code and whether it points to the correct URL; a quick way to check this is sketched below.
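As an illustration, a short script can extract the canonical URL declared on a page and compare it with the page's own address. This is a simplified sketch assuming the requests library and the usual attribute order (rel before href); the URL is a placeholder.

# Extract the canonical URL from a page and compare it with the page's own URL
import re
import requests

url = "https://mysupersite.com/dublicat-page/"
html = requests.get(url, timeout=10).text

match = re.search(r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', html, re.IGNORECASE)
if match:
    canonical = match.group(1)
    print("Canonical URL:", canonical)
    if canonical.rstrip("/") != url.rstrip("/"):
        print("This page declares another URL as the main version, so it may stay out of the index")
else:
    print("No rel=canonical tag found")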
Indexing is blocked by the X-Robots-Tag header
X-Robots-Tag is a directive in the server's response headers that can be used to prevent robots from indexing specific pages or files. Example of an HTTP response from a server with an X-Robots-Tag directive that prohibits indexing of the page:
HTTP/1.1 200 OK
Date: Tue, 11 May 2022 22:34:11 GMT
(…)
X-Robots-Tag: noindex
(…)
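Unlike the robots meta tag, this header is not visible in the page code, so it has to be checked in the server response. A minimal sketch of such a check, assuming the requests library; the URL is a placeholder.

# Check the HTTP response headers for a blocking X-Robots-Tag directive
import requests

url = "https://example.com/some-file.pdf"
response = requests.head(url, timeout=10, allow_redirects=True)

tag = response.headers.get("X-Robots-Tag")
if tag and "noindex" in tag.lower():
    print("Indexing is blocked by the server header:", tag)
else:
    print("No blocking X-Robots-Tag header found")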
Other reasons
- Not all domain variants (http:// and https://) are added to Google Search Console
- The site is sanctioned by the search engine
- The site is not adapted for mobile devices
- Technical SEO issues
- Slow response time of the website's server
- The site is too young, or the domain has a bad history
To summarize
For a website to be successful, it is important to publish good content, work on technical SEO and build quality backlinks. But all of this is useless if the site is not indexed. So make sure every indexing issue is resolved, and Google will thank you with good traffic.




