How to get Google to index website pages faster — best ways to improve website indexing by search engines
If you have a quality web page that you spent a lot of time creating, but it is not in a search engine's index, it simply will not get traffic and will not achieve its goal. To avoid this, you need to make sure your site is indexed quickly by search engines such as Google. In this article, I've compiled a list of recommendations on how to do that.
XML Sitemap
Set up automatic generation of an XML Sitemap, a file containing a structured list of your site's pages. The sitemap helps search engine bots find and crawl your pages faster.
For each page, the file can specify the last modification date, priority, and recommended crawl frequency. Extended sitemap formats can also carry additional information, such as a list of images used on the page.
The Sitemap is usually referenced in the robots.txt file. It is also useful to submit a link to it in webmaster panels such as Google Search Console and Bing Webmaster Tools.
In Google Search Console, go to Indexing > Sitemaps and add the sitemap URL.
A single sitemap file must not exceed 50,000 URLs. If the site is large, you can either use a sitemap index that points to several files, or split the maps by site section: main pages, articles, products, categories, news, and so on.
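For reference, a minimal sitemap.xml looks roughly like this (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://site.com/site-page</loc>   <!-- page address -->
    <lastmod>2024-01-15</lastmod>           <!-- last update date -->
    <changefreq>weekly</changefreq>         <!-- recommended crawl interval -->
    <priority>0.8</priority>                <!-- relative priority, 0.0 to 1.0 -->
  </url>
</urlset>
```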
HTML Sitemap
An HTML sitemap is not a service file like the XML version, but a regular web page that describes the structure of the site.
It is a list of links to the important pages and sections of the site. If you have an online store with thousands of products, you do not need to list them all on this page. Often sitemap.html is a simplified version of sitemap.xml, but unlike it, it has value not only for search engines but also for visitors.
It acts as a kind of "hub": a centralized, structured list of available pages that helps visitors navigate the site and quickly find the information they need, while also having a positive effect on indexing by search engines.
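As a rough sketch, such a page can be as simple as a nested list of links (section names and URLs are placeholders):

```html
<!-- sitemap.html: a simplified, human-readable map of the site -->
<h1>Sitemap</h1>
<ul>
  <li><a href="/articles/">Articles</a>
    <ul>
      <li><a href="/articles/how-to-speed-up-indexing">How to speed up indexing</a></li>
    </ul>
  </li>
  <li><a href="/products/">Products</a></li>
  <li><a href="/contacts/">Contacts</a></li>
</ul>
```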
Robots.txt
This file gives search engines recommendations about which pages and sections of the site to crawl and index, and which to skip.
Use robots.txt directives to block unnecessary pages from indexing, such as site search results, payment pages, the personal account, and the shopping cart.
It also helps to hide technical duplicates of pages, which appear when, due to the way the CMS works, pages are available both at human-readable addresses and by page ID, for example:
https://site.com/site-page
https://site.com/index.php?id=2
Google sees both pages and identifies them as duplicates. One of them will show up in search, but after a while Google may change its mind and start showing the other one instead, which hurts the SEO of both pages.
As I wrote above, robots.txt is also where you specify the addresses of your sitemaps.
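Putting it together, a typical robots.txt might look like this (the paths are placeholders based on the examples above):

```
User-agent: *
Disallow: /search/       # internal site search results
Disallow: /cart/         # shopping cart
Disallow: /account/      # personal account and payment pages
Disallow: /*?id=         # technical duplicates available by page ID

Sitemap: https://site.com/sitemap.xml
```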
Duplicate pages
When a search engine finds two identical or very similar pages on the same site, it tries to figure out which one should stay in the index and which one should be ignored. One of the pages will be dropped from the index, and not always the one you would choose.
To avoid this, you need to take care of the correct settings and content of the site. Duplicates can be:
Technical - duplicate pages generated by the CMS, as described above. In MODX Revolution, for example, with default settings pages are available both by their alias and by ID passed as a URL parameter.
To remove technical duplicates, you need to:
- configure redirects (see the example at the end of this section)
- close unnecessary pages in robots.txt
- configure friendly URL generation in the CMS
Semantic - similar pages created to answer the same search queries. To avoid this, build your keyword map (semantic core) properly. For example, you have created two sections, «how to make google index the site faster» and «how to make the indexing of the site more effective». Logically, and from a search engine's point of view, this should be one page.
To eliminate semantic duplicates, find out which page gets more traffic and ranks better in search. The duplicate should then be "glued" to the more effective page: move the important content over and set up a redirect, as in the example below.
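To illustrate the redirect step, here is a hedged Apache .htaccess sketch; the ID and paths are placeholders, and the same thing can be configured in nginx or in the CMS itself:

```apache
# 301-redirect the technical URL /index.php?id=2 to its friendly alias.
RewriteEngine On
RewriteCond %{QUERY_STRING} ^id=2$
RewriteRule ^index\.php$ /site-page? [R=301,L]

# 301-redirect a semantic duplicate to the stronger page.
Redirect 301 /how-to-make-indexing-more-effective /how-to-make-google-index-the-site-faster
```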
Content quality
Create valuable, relevant and unique content that engages visitors and encourages them to stay on the site longer.
Good behavioral metrics increase the chance of indexing and improve search engine rankings. In addition, expert content helps build your link profile.
Structure and DFI
Page nesting depth also plays an important role. It can be split into two components:
- URL nesting: how many subdirectories there are in the page address
- click nesting (the DFI, distance from the index page): how many clicks it takes to reach the page from the main page
The fewer clicks a page is from the main page, the higher priority search engine bots give it.
That is why new products, articles, and so on are often featured on the main page: the search engine robot visits the home page much more often than other pages.
Load speed
If the server response is slow, a search robot may not wait for the page to load and will not index it. Slow responses also reduce the speed at which the robot crawls the site.
Slow loading hurts user behavior and is penalized by search engines, so loading speed is worth working on.
Optimize images, minimize CSS and JavaScript files, enable browser caching, and use content delivery networks (CDNs) to speed up page load times.
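For example, browser caching of static assets can be enabled with a few lines of Apache configuration (a hedged sketch; adjust the file types and lifetimes to your site, or use the nginx equivalent):

```apache
# Cache static assets in the visitor's browser for a month.
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/webp             "access plus 1 month"
  ExpiresByType image/png              "access plus 1 month"
  ExpiresByType text/css               "access plus 1 month"
  ExpiresByType application/javascript "access plus 1 month"
</IfModule>
```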
Internal linking
Internal links help search engine bots discover new content and distribute link weight across the site.
Build internal links not only for search engines but also for users: good linking improves behavioral factors and keeps visitors on the site longer. This also increases the probability of a conversion, whether that is clicking an ad or ordering a product or service.
Backlinks
Try to get quality external links from authoritative sites. Backlinks act as a trust signal, telling search engines that your site is popular on the web and should be crawled more actively. Even nofollow links, which do not pass link weight, can have a positive impact on indexing.
Focus on natural and relevant backlinks rather than low-quality or spammy ones. Link building is a broad topic in itself, as there are many different tools for growing a link profile.
Social media
Promote your site on social media. To some extent this relates to the previous topic of backlinks. In most social networks you do not even get a nofollow link, but a PHP or JavaScript redirect link. Don't be discouraged: a search engine bot will still follow it and reach your site.
Crawl budget
Crawl budget is the number of pages of a website that a search engine robot (crawler) can scan in a given period of time. Every unnecessary URL the crawler opens reduces the chance that an important URL gets indexed. Redirects, broken links, and links to external resources or non-indexed pages also eat into the crawl budget.
Let's outline the important things you can do to save crawl budget.
Unnecessary redirects
It often happens, for example, that copywriters insert links without a trailing slash, or with one, where the canonical URL is the opposite. For the user there is no difference, because the CMS will most likely redirect them to the correct page.
But each such redirect consumes crawl budget. Crawl the site, find internal links that return 301 and 302 redirects, and replace them with direct links to the final URLs.
Broken links
The situation is similar with 404 errors: nobody needs broken links on the site, so along with fixing redirects, take care of removing or correcting broken links.
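A quick way to spot both problems for a single link is to check its response code, for example with curl (the URL is a placeholder; site-wide audits are usually done with a crawler tool):

```bash
# Prints the HTTP status code and the redirect target (if any) for a URL.
curl -s -o /dev/null -I -w "%{http_code} %{redirect_url}\n" https://site.com/site-page
# 301 https://site.com/site-page/  -> point internal links at the final URL
# 404                              -> broken link, fix or remove it
```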
Nofollow, noreferrer, noopener
Mark links to other sites and links to non-indexed pages with the rel="nofollow noreferrer noopener" attributes.
Internal links to non-target pages, such as the authorization page, shopping cart, or payment page, can be marked with rel="nofollow". Such a link will not pass link weight.
For links to external sites, use the full set rel="nofollow noreferrer noopener": this preserves link weight, hides the referrer, and protects against the opened page hijacking the original tab (reverse tabnabbing).
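In HTML it looks like this (the URLs are placeholders):

```html
<!-- Internal link to a non-target page: no link weight is passed -->
<a href="/cart/" rel="nofollow">Shopping cart</a>

<!-- External link: no weight passed, no referrer sent, no access to window.opener -->
<a href="https://example.com/" rel="nofollow noreferrer noopener" target="_blank">External site</a>
```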
Trash pages
Try to make sure that every indexed page is useful for visitors, contains enough content, and answers real search queries. Otherwise, a large number of junk or empty pages will eat up the bot's crawl time and draw attention away from important pages.
Periodically check for crawl errors in the Google and Bing webmaster consoles. Quickly fix any problems, such as broken links or inaccessible pages, so that search engines can index the site faster.
Reindexing
Webmaster panels let you notify search engines when a new page is published on the site or existing content is updated.
In Google Search Console, enter the page URL into the inspection field at the top, wait for the check to finish, and click «Request indexing».
Requesting reindexing this way is very effective for getting a page into the index quickly: it usually takes from a few minutes to a few hours.
Together, these steps will help speed up the indexing of your site, and new pages can appear in Google's index within hours.