The discovery and analysis of web pages does not take place only through the indexing process we already know: there is another decisive moment in which crawlers come into contact with what we publish, namely rendering, the operation of interpreting and graphically reproducing a page. Today we dedicate ourselves to this topic in order to understand how it works and why it is useful to search engines.

The site discovery process for search engines

Generally, when we think about the ranking of pages we immediately think of indexing, says Dave Davies on the pages of SEL, and especially of the moment when a search engine has already:

  • Discovered a page via sitemap or crawling.
  • Proceeded to visit the page for indexing.
  • Collected all the content of the page.
  • Started ranking the page for queries.

Probably, this is the most important phase of the process, because these are the factors that influence rankings, but it is not the final phase of the discovery process and, according to Davies, its weight could decrease over time while the final phase, rendering, gains ground.

What is rendering

Literally, rendering is the graphical rendition of the information encoded on the page in HTML, which is translated (rendered) so as to give an understandable form to the graphic elements that make up every website, each with its own characteristics.

This process is made possible by special software, the rendering engines, which collect and digest the data packets coming from the server and transform the lines of HTML code into graphical and functional elements, such as text blocks with hyperlinks, images, videos and other elements such as PDF files.

The work of rendering engines

The work of a rendering engine essentially consists of three phases:

  • Decoding of inputs, i.e. the incoming information.
  • Data processing.
  • Graphic representation of information.

To simplify, the web rendering engine picks up the source code of the web pages requested by the user and scans all the HTML and CSS elements present; this information is then decoded to start building the web page to be displayed.

In this process the engine identifies the various parts of which the page is composed, such as text blocks and their formatting, the background, the colors, the content, the various multimedia elements and so on. Using the information in the source code of the resource and in the CSS files, the engine processes and arranges these elements and then, together with the GPU, transforms the source code into the elements actually displayed in the browser tab, finally allowing the user to access and consult the desired resource correctly.
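To make these three phases concrete, here is a deliberately naive TypeScript sketch, a toy illustration rather than a real engine: the "parser" handles only flat, well-formed markup, the stylesheet is a plain lookup table, and "painting" is just console output.

```typescript
interface StyledNode {
  tag: string;
  text: string;
  color: string;
}

// Phase 1: decoding of inputs (a trivial "parser" for flat markup).
function decode(html: string): { tag: string; text: string }[] {
  const matches = html.matchAll(/<(\w+)>([^<]*)<\/\1>/g);
  return Array.from(matches, (m) => ({ tag: m[1], text: m[2] }));
}

// Phase 2: data processing (match stylesheet rules to each node).
function applyStyles(
  nodes: { tag: string; text: string }[],
  css: Record<string, string>
): StyledNode[] {
  return nodes.map((n) => ({ ...n, color: css[n.tag] ?? "black" }));
}

// Phase 3: graphic representation, here reduced to a textual "paint".
function paint(nodes: StyledNode[]): void {
  for (const n of nodes) {
    console.log(`[${n.tag}] (${n.color}) ${n.text}`);
  }
}

const html = "<h1>Hello</h1><p>A paragraph of content.</p>";
const css = { h1: "blue", p: "gray" };
paint(applyStyles(decode(html), css));
```

A real engine of course builds a full DOM and CSSOM, computes layout and then paints via the GPU, but the division of labor is the same.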

Differences between rendering and indexing

The difference between rendering and indexing can be illustrated very easily with a comparison between two images, explains Davies: at the top we have the lines of HTML code of a page of our blog, while at the bottom there is the graphical representation of the same page as displayed in the browser.

[Image: difference between indexing and rendering of the same page]

Basically, it is the same content, shown first as it appears during indexing (HTML) and then as it appears after rendering (Chrome).
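The same comparison can be made in code. Below is a hedged sketch, assuming Node 18+ (for the built-in fetch) and the Puppeteer library as a stand-in for any headless browser; the URL is a hypothetical example, and the file must run as an ES module for top-level await.

```typescript
import puppeteer from "puppeteer";

const pageUrl = "https://example.com/";

// What indexing sees first: the raw HTML returned by the server.
const rawHtml = await (await fetch(pageUrl)).text();
console.log("Raw HTML length:", rawHtml.length);

// What rendering sees: the DOM after scripts and styles have executed.
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(pageUrl, { waitUntil: "networkidle0" });
const renderedHtml = await page.content();
console.log("Rendered HTML length:", renderedHtml.length);

await browser.close();
```

For a Javascript-heavy page the two outputs (and, more importantly, their text content) can differ dramatically; for a static page they will be nearly identical.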

Why rendering is important

You might think that rendering is important only for those who have Javascript sites, but in reality this process concerns and affects all sites, as confirmed by the fact that search engines were rendering pages even before the recent boost in the use of Javascript for websites.

Essentially, the reason this process matters is that rendering provides the truth.

Through the code, a search engine can understand what a page is about and roughly what it contains. With rendering, it can understand the user experience and gain much more information about which content should take priority.
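A minimal, hypothetical example makes the gap concrete. Imagine a client-side-rendered page whose served HTML body contains nothing but an empty <div id="app"></div>, plus a script like the following:

```typescript
// Hypothetical app.ts for a client-side-rendered page. Until a renderer
// executes this script, the page's real content does not exist anywhere
// in the HTML that a crawler downloads.
const app = document.getElementById("app");
if (app) {
  const article = document.createElement("article");
  const heading = document.createElement("h1");
  heading.textContent = "The actual page title";
  const body = document.createElement("p");
  body.textContent = "Content that only appears after rendering.";
  article.append(heading, body);
  app.appendChild(article);
}
```

From the source code alone the engine sees an empty shell; only rendering reveals the heading, the paragraph and their relative prominence.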

During the graphical rendering phase the search engine can answer many questions that are relevant to properly understanding a page and how it should be ranked. For example, the article mentions issues such as the following (a sketch of how such checks might look in code appears after the list):

  • Is the content hidden behind a click?
  • Is an ad filling the page?
  • Is the content displayed at the bottom of the code actually displayed at the top or in the navigation?
  • Is the page slow to load?
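As a sketch of how such checks might look once a page has been drawn (again assuming Puppeteer; the main selector is a hypothetical stand-in for the page's content node):

```typescript
import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto("https://example.com/", { waitUntil: "networkidle0" });

const findings = await page.evaluate(() => {
  const main = document.querySelector("main"); // hypothetical content node
  if (!main) return null;
  const style = window.getComputedStyle(main);
  const box = main.getBoundingClientRect();
  return {
    // Hidden behind a click or a collapsed container?
    hidden: style.display === "none" || style.visibility === "hidden",
    // Does the content actually sit near the top of the viewport,
    // regardless of where it appears in the source code?
    distanceFromTop: box.top,
  };
});

console.log(findings);
await browser.close();
```

None of these answers is available from the raw HTML: visibility rules live in CSS, and positions only exist after layout.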

When rendering actually happens

According to the information available, and always in principle, rendering occurs after indexing and may even take days or weeks, on a variable timeline. This essentially means that search engines will understand the content and context of a page before they get a full picture of how it should be prioritized.

This obviously does not mean that search engines are completely in the dark until rendering, because there are some solid rules and heuristics that they have acquired over the years that allow them to quickly make assumptions about:

  • Which elements are present.
  • Where they are located.
  • How important they are to the user.

But it is only after the graphical rendering of the pages that the engines will know whether their assumptions are correct and can fully understand a page and its shape.

Problems with rendering

In summary, search engines send a crawler to the site, which renders the page as a browser would.

While at Bing "they seem to be a little better at rendering, which is interesting", at Google, Googlebot has a Web Rendering Service (WRS) component, updated in May 2019 on the occasion of the crawler's evergreen update, which now uses the latest version of Chrome for rendering.

Until then, the WRS used version 41 of Chrome, which was exceptional for compatibility but "was a nightmare for sites that relied on modern functionality, such as modern Javascript", writes Davies.

In essence, this means that "when your page is rendered by Googlebot it is rendered more or less as you would see it in your browser". However, it is wrong to think that it is enough to check in a browser whether the page works properly, because the point of rendering lies elsewhere.

If we have a "basic site with predictable HTML and almost zero dynamic content, there is really nothing to worry about", and probably there was nothing to worry about even with the old WRS configuration.

But for sites "with dynamic content offered via Javascript, there is a big warning and it is rooted in a gap": until the page is displayed, the engine does not know what is on it. Unlike a site with simple HTML output, where the engine might lose some of the context but still gets the content, with a site that depends on rendering, such as one built on Javascript, "the engine will not know what content is on the page until the WRS has completed its work".

And so the "weeks" that may pass before this work is completed are quite impactful, and it is also for this reason that the engines are working to reduce this latency. Until then, however, Javascript developers will have to rely on pre-rendering (creating a static version of each page for the engines), which is by no means an ideal solution.
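As a hedged sketch of what pre-rendering can look like in practice (again assuming Puppeteer; the URL list and output paths are hypothetical), each page is rendered once and its final DOM is saved as a static HTML file to serve to crawlers:

```typescript
import { writeFile } from "node:fs/promises";
import puppeteer from "puppeteer";

const urls = ["https://example.com/", "https://example.com/about"];

const browser = await puppeteer.launch();
for (const url of urls) {
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });
  // Serialize the fully rendered DOM as a static snapshot.
  const html = await page.content();
  const name = new URL(url).pathname.replace(/\//g, "_") || "_index";
  await writeFile(`prerendered${name}.html`, html);
  await page.close();
}
await browser.close();
```

The drawback Davies alludes to is visible even in this sketch: every content change requires regenerating the snapshots, and the static copies can drift away from what users actually see.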

How the Web Rendering Service works

The rendering life cycle follows this path (a toy model of the process appears after the list):

  • A page is discovered by sitemap, crawler, etc.
  • The page is added to the list of pages on the site to be scanned when the crawl budget is available.
  • The content of the page is scanned and indexed.
  • The page is added to the list of pages on the site to be rendered when render budget is available.
  • The page is rendered.
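A toy TypeScript model of this life cycle (purely conceptual, not Google's implementation; "budget" is reduced to a per-cycle counter) shows why indexing and rendering can drift weeks apart: they are separate queues, drained at separate rates.

```typescript
interface PageRecord {
  url: string;
  indexed: boolean;
  rendered: boolean;
}

const crawlQueue: PageRecord[] = [];
const renderQueue: PageRecord[] = [];

// A page is discovered and queued for crawling.
function discover(url: string): void {
  crawlQueue.push({ url, indexed: false, rendered: false });
}

// When crawl budget is available, content is scanned and indexed,
// then the page joins the (separate) render queue.
function crawlCycle(crawlBudget: number): void {
  for (let i = 0; i < crawlBudget && crawlQueue.length > 0; i++) {
    const page = crawlQueue.shift()!;
    page.indexed = true;
    renderQueue.push(page);
  }
}

// When render budget is available, full understanding finally arrives.
function renderCycle(renderBudget: number): void {
  for (let i = 0; i < renderBudget && renderQueue.length > 0; i++) {
    const page = renderQueue.shift()!;
    page.rendered = true;
    console.log(`${page.url}: indexed and rendered`);
  }
}

discover("https://example.com/");
crawlCycle(10);  // indexing can happen long before...
renderCycle(10); // ...render budget becomes available
```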

The critical and unspoken element of the process is the rendering queue: "Googlebot could reach a page weeks before rendering it, and until then some content (Javascript sites) or context (all sites) might be missing", explains the article. And when a page reaches the top of the rendering queue, the engine sends what is called a headless browser, that is, a browser without a graphical user interface.

Somehow, however, the search engine manages to figure out what appears on a page, where and how, even though it has no eyes to see it: when the process completes successfully, "the rendered version will look the same to Googlebot as in graphical browsers", while "otherwise, the page is probably relying on an unsupported feature, such as a request for the user's permission".
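A hedged sketch of that failure mode, with a hypothetical geolocation enhancement: if the first paint depends on a permission prompt that a headless browser will never answer, the content never appears; rendering the content first and requesting permission lazily avoids the problem.

```typescript
// Renders a paragraph of article text into the page.
function renderArticle(text: string): void {
  const el = document.createElement("p");
  el.textContent = text;
  document.body.appendChild(el);
}

// Good: the main content renders unconditionally, so any renderer,
// headless or not, will see it.
renderArticle("Main content, visible to any renderer.");

// Optional enhancement: request geolocation only after rendering,
// and degrade gracefully if the environment never grants it.
if ("geolocation" in navigator) {
  navigator.geolocation.getCurrentPosition(
    (pos) => renderArticle(`Nearby info for latitude ${pos.coords.latitude}`),
    () => { /* denied or unanswered: the page still works */ }
  );
}
```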

Conclusions: rendering is still problematic

According to Davies, at the moment “we can count on the indexing capabilities of the engines, but the rendering side still has a long way to go” to bridge the gap between what search engines see and what a user’s browser does.

Based on the author's experience, there is a possibility that in the short-to-medium term "the latency between indexing and rendering is drastically reduced, especially for sites that rely on this process", and this could open up a world of possibilities for sites that need graphical rendering to be understood.
