Page indexing issues, 15 causes blocking our presence on Google


My page does not appear in Google's SERPs, or the entire site is missing from the search results. This is a situation familiar to many webmasters and marketers, who often do not know where to start in order to solve a problem that can have obvious negative consequences and damage the business. There are indeed many potential indexing problems, errors or complications that can prevent Google from indexing web pages properly, and only by knowing them (or at least the main and most frequent ones) can you learn which solutions to implement to regain visibility on the search engine.

Indexing problems, 15 frequent causes to know and solve

Before we start to go through all the possible indexing issues, it is worth taking a step back and recalling some crucial concepts to understand how discovery and inclusion in the Google Search Index work: the large list that contains hundreds of billions of web pages and exceeds 100,000,000 gigabytes in size.

Creating a website and publishing content online does not mean that all of its pages automatically and instantly appear among the search results: web crawlers, in fact, make a first selection of all the URLs they visit and decide which ones to send to the index. This is, in a nutshell, the meaning of indexing, the technical activity that precedes ranking and that simply indicates that a page has been taken into account, analyzed and stored by Google.

We have various tools and techniques available to check whether a site is indexed, especially through Google Search Console, and last year Google tested in the United States the Report an Indexing Issue feature, which allows you to report indexing problems directly – while two alternative search engines, Bing and Yandex, have launched the IndexNow system to manually submit a page for indexing.
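Two quick, hedged examples of these checks (example.com and the key value are placeholders): a site: query in Google shows whether any pages of a domain are currently indexed, while IndexNow accepts a simple GET request – with a key that must also be hosted as a text file at the site root – to notify participating engines such as Bing of a new or changed URL:

    # Query to type into Google: lists the indexed pages of the domain
    site:example.com

    # IndexNow notification via GET (the key must match https://example.com/0123456789abcdef.txt)
    https://www.bing.com/indexnow?url=https://example.com/new-page/&key=0123456789abcdef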

Despite these steps forward, however, as mentioned it can still happen that a page (or an entire site) cannot be found in Search, and in particular there are 15 indexing problems that can hinder the success of our project, also highlighted by Brian Harnish in Search Engine Journal.

Errors and issues that prevent indexing

A first aspect not to be overlooked is that indexing on Google is not immediate and it may take days or even weeks before the search engine adds a resource to its list: before assuming there is a problem, it is therefore better to wait at least a week after submitting a sitemap or requesting indexing, and only then check whether any pages are still missing.

One possible reason why Google does not index a site is the absence of a domain name, which may depend on using the wrong URL for the content or on an incorrect setting in WordPress.

If this is what is happening, there are some easy solutions: first, we can check whether the web address starts with something like “https://XXX.XXX…” – which means that visitors could be typing an IP address instead of a domain name and being redirected to the site – and then make sure that the redirection from the IP address is configured correctly.

One way to solve this problem is to add a 301 redirect from the WWW versions of pages to their respective domain and, basically, make sure you actually have a domain name.

A similar problem occurs if the site is indexed with a different domain or with a subdomain – for example, with http://example.com instead of http://www.example.com.
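For sites running on Apache, a common fix – sketched below under the assumption that www.example.com is the preferred version – is a 301 rewrite in the .htaccess file that sends every other host (the bare domain, or even direct IP access) to the canonical one:

    RewriteEngine On
    # Send any request not already on the canonical host to https://www.example.com
    RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
    RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]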

Content quality problems can also prevent pages from being included in Google, and according to a study carried out not long ago on sites of different sizes they are the main cause of indexing failure: we know that well-written content is key to success on Google, so if we offer low-quality pages that do not even reach the level of the competition it is hard to expect crawlers to take them into account.

It is not a matter of word count – 300-word content can fail to be indexed just as thousand-word content can – but of thin content (poor content that easily runs into indexing problems because it is not unique and does not meet the minimum quality standards compared to the competition) and of the usual concepts of quality and usefulness: our pages must be good and informative, must answer the user's questions (implicit or explicit), provide information or offer a point of view sufficiently different from others in the same niche.

Not even Google likes a site that is not user-friendly

Having a user-friendly and engaging site is crucial for good SEO, and consequently a site that is not easy to use and does not engage visitors (or, worse, has a navigation system built on complex hierarchies of links that creates frustration or exasperation) is an element that can cause indexing problems.

Google does not want users to spend too much time on a page that takes an eternity to load, has confusing navigation or is simply difficult to use because there are too many distractions (such as ads above the fold or interstitials).

This is particularly true for people who use mobile devices, an area in which Google introduced the mobile-first index six years ago and where a simple rule applies: no matter how good the content is, it counts for nothing if a smartphone or tablet user cannot view it. Mobile optimization is based on applying responsive design principles, and components such as fluid grids and CSS media queries can do a lot to ensure that users find what they need without running into browsing problems.
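As a minimal illustration of those principles (the selectors and the breakpoint are arbitrary), a viewport meta tag in the HTML combined with relative widths and a CSS media query is often enough to keep content readable on small screens:

    <!-- Tell mobile browsers to use the device width instead of a zoomed-out desktop layout -->
    <meta name="viewport" content="width=device-width, initial-scale=1">

    /* Fluid grid: widths in relative units, adjusted below 600px */
    .content { max-width: 100%; }
    @media (max-width: 600px) {
      .sidebar { display: none; }
      .content { font-size: 1.1rem; }
    }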

Especially in the last year, after the introduction of Page Experience among the ranking factors, loading time is an element that can lead to exclusion from the Google Index, and there may be several problems affecting the time it takes to load pages. For example, there may be too much content on the page, which makes it hard for a user's browser to handle, or we may be using an obsolete server with limited resources: in any case, what matters is to ensure fast loading.
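Two simple, widely supported markup measures that usually help on this front (the file names are placeholders): native lazy loading postpones below-the-fold images, and the defer attribute lets the HTML finish parsing before scripts run:

    <!-- The image is fetched only when it is about to enter the viewport -->
    <img src="photo.jpg" alt="Product photo" loading="lazy" width="800" height="600">

    <!-- The script downloads in parallel and runs only after the document is parsed -->
    <script src="app.js" defer></script>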

Technical issues that can hinder inclusion in the Index

We now come to some practical examples of technical problems that can prevent pages and the site from being analyzed correctly by Googlebot for inclusion in the Index.

This includes choices such as the use of programming languages that are too complex – whether old or modern, like JavaScript – which, when configured incorrectly, cause crawling and indexing problems.

More specifically, the use of JavaScript to display content can lead to negative situations: it is not a problem with the language itself, but rather with its application through techniques that may resemble cloaking or otherwise appear shady. For example, if we have rendered HTML and raw HTML, and a link in this raw HTML is not present in the rendered one, Google may not crawl or index that link; so, as Harnish says, “don't hide your JS and CSS files even if you like to do so”, because “Google claimed to want to see all your JS and CSS files while scanning”.
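A hedged illustration of that raw-versus-rendered gap (the element ID and URL are invented): the link below exists only after the script runs, so a crawler that does not execute or fully render the JavaScript may never discover it:

    <!-- Raw HTML served to the crawler: the container is empty -->
    <div id="related"></div>
    <script>
      // The link appears only in the rendered DOM, after JavaScript execution
      document.getElementById('related').innerHTML =
        '<a href="/important-page/">Important page</a>';
    </script>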

The same difficulty in seeing the page in the SERPs arises if we use plugins that prevent Googlebot from crawling the site: the US expert mentions the robots.txt plugin, which can be automatically set to noindex for the whole site, making it virtually impossible for Googlebot to crawl it.

Obviously, the robots.txt file itself can also be a critical element, and you should follow the best practices for robots.txt to avoid or limit errors, thinking carefully about which parts of the site we want to keep out of crawling and then using disallow accordingly on those unimportant sections. Basically, a good technical SEO strategy can prevent this kind of indexing error, as well as help pages achieve good Core Web Vitals scores and improve the other aspects that may affect Google's ability to analyze pages and deem them worthy of its Index.
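A minimal robots.txt sketch along these lines (the paths are purely illustrative): block only the low-value sections, avoid blanket rules that would lock Googlebot out of the whole site, and point crawlers to the sitemap while you are at it:

    User-agent: *
    # Keep unimportant sections out of crawling
    Disallow: /cart/
    Disallow: /internal-search/
    # Note: a lone "Disallow: /" here would block the entire site

    Sitemap: https://www.example.com/sitemap.xml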

Other aspects that can impact page indexing

Managing technical SEO also helps avoid situations that can interfere with the proper functioning of the site, such as incorrect robots meta tag settings (for example unintentional and unwanted noindex or nofollow values) or redirect loops.
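This is the tag to look for in the page's head section when a site unexpectedly drops out of the index – WordPress's “Discourage search engines” option and some SEO plugins can add it site-wide. A hedged sketch of the two opposite settings:

    <!-- Blocks indexing of the page and the following of its links -->
    <meta name="robots" content="noindex, nofollow">

    <!-- Allows normal indexing (omitting the tag entirely has the same effect) -->
    <meta name="robots" content="index, follow">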

Redirect chains, in particular, can also result from typos when writing a URL, which create a duplicate address that points to itself; to identify and resolve such cases, on WordPress we can open the .htaccess file and look through the list of redirects, verifying that everything is in order (and, where appropriate, turning 302 redirects into 301s).
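Sticking with the Apache/.htaccess assumption used earlier (the paths are invented), the sketch below shows a chain of temporary redirects collapsed into single permanent hops that point straight to the final URL:

    # Before: a chain of temporary redirects (/old-page -> /interim-page -> /new-page)
    # Redirect 302 /old-page /interim-page
    # Redirect 302 /interim-page /new-page

    # After: one permanent hop per address, both pointing to the final URL
    Redirect 301 /old-page /new-page
    Redirect 301 /interim-page /new-page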

It is also important to submit a sitemap to Google, which is perhaps the best way for the search engine to discover the pages of the site and increase the chances of each page being crawled and indexed correctly. Without a sitemap, Googlebot will come across our pages randomly and blindly, unless they are already indexed and receiving traffic; moreover, it is not enough to submit the map only once (especially for dynamic sites): the file must be periodically updated and resubmitted so that important pages and new content are crawled and indexed.
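A minimal sitemap sketch (URLs and dates are placeholders); it can be submitted through the Sitemaps report in Search Console and referenced from robots.txt, as in the earlier example, so that crawlers can also find it on their own:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2022-05-10</lastmod>
      </url>
      <url>
        <loc>https://www.example.com/new-article/</loc>
        <lastmod>2022-05-12</lastmod>
      </url>
    </urlset>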

A final element that can cause the non-indexing of a site's pages is found in the history of the domain itself and, specifically, in previous unresolved manual actions. Google has repeatedly stated that penalties can haunt us, and if we do not properly complete the reconsideration process to clean up the site it is highly likely that even new resources will find no place in the Index. This also applies to recently purchased domains, which may have a dark history of Google penalties behind them – which is why it is essential to check the site's “criminal record” before investing, because it can then take precious time for Google to understand that there is a new owner who has cut ties with the past.

What are the 15 reasons for indexing problems on Google

To sum them up before the conclusion, then, the 15 potential causes of indexing problems on Google are:

  1. Waiting time
  2. Absence of domain name
  3. Indexing with a different domain
  4. Poor content
  5. Poor user experience
  6. A site that is not mobile-friendly
  7. Slow loading pages
  8. Complex programming languages
  9. Javascript used improperly
  10. Plugins blocking Googlebot
  11. Blocks in the robots.txt file
  12. Settings in meta tag robots
  13. Redirect chains
  14. Missing sitemap
  15. Domain sanctioned with unresolved manual actions

There are, then, many elements to assess if we notice that our pages are absent from Google Search, a real problem that risks frustrating all SEO efforts because, in fact, it takes away our visibility and the opportunity to reach the public.

And so, in addition to rightly dedicating time to content, technical SEO and link management (essential components that allow the site and its pages to achieve the quality and authority needed to compete on the search engine), we must not overlook indexing, the first step of our race to the first page.
