Bulk URL indexation check on Google

Estimated reading time: 5 min
Last modified: 13.10.2020

To succeed in SEO, your site has to be indexed. That’s the underlying assumption: if search engines don’t know about your URLs, they cannot show them in the search results. Sometimes, however, search engines fail to index certain pages or whole sections of your website.

You can find URLs that are not indexed using the Miner Fulltext Index Checker, which will also help you to discover why your site is not indexed.

Search engines do not offer a way to check indexation in bulk. Tools such as Google Search Console show only basic statistics on the number of indexed URLs but do not provide URL-level data, and the site: operator unfortunately isn’t 100% reliable.

In practice

You can use Fulltext Index Checker:

  • If you have just started working on a new project and are doing the initial SEO audit. One of the main steps is to find out whether your URLs are correctly indexed.
  • If you noticed a drop in, or a complete loss of, organic search traffic to a specific page and need to find out whether the page has fallen out of the search index.
  • If you have any doubts about the indexation status of your site.
  • If there is a significant fluctuation in the indexing curve, for example in Google Search Console. The sooner you find out which URLs have been deindexed, the easier it is to detect the cause and fix the problem.

Let’s start from the very beginning.

What is a search engine index?

Search engines go through billions of pages every day. They crawl content by following links so that it can be stored and organized in a database for quick retrieval. This database is called an “index”. However, not all discovered URLs are important to these crawlers, and only some of them are indexed and used in search results. Indexing lets search engines organize content ahead of a search, so users receive very fast responses to their queries.

Can website pages be removed from the index?

In short, yes. A page that has already been indexed can suddenly be removed from a search engine’s index. There can be several reasons for this:

  • The page has stopped working and returns a 4xx or 5xx status code, which tells search engines that the URL should be removed from their index.
  • The page is blocked by a robots.txt file, which tells search engines not to crawl it (a noindex directive, by contrast, tells them not to index it).
  • A search engine has taken a manual action on your site and deindexed your page, for example because of outdated or non-original content.
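The first two reasons can be checked programmatically. The following is a minimal Python sketch, not part of Marketing Miner: the function name `deindex_reasons` and the default user agent are assumptions made for illustration, and the status code and robots.txt content are passed in so that the logic can be tried without any network access. A manual action cannot be detected this way.

```python
from urllib.robotparser import RobotFileParser


def deindex_reasons(status_code, robots_txt, url, user_agent="Googlebot"):
    """Return a list of likely reasons why `url` may drop out of an index.

    status_code: HTTP status the page currently returns.
    robots_txt:  the text content of the site's robots.txt file.
    """
    reasons = []

    # 4xx and 5xx responses signal that the page is gone or broken.
    if 400 <= status_code < 600:
        reasons.append(f"returns HTTP {status_code}")

    # Parse the robots.txt rules and test the URL against them.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        reasons.append("blocked by robots.txt")

    return reasons


if __name__ == "__main__":
    # Made-up example data.
    robots = "User-agent: *\nDisallow: /private/\n"
    print(deindex_reasons(404, robots, "https://example.com/private/page"))
    print(deindex_reasons(200, robots, "https://example.com/blog"))
```

An empty result only means that neither of these two checks found a problem; the page can still be missing from the index for other reasons.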

If you want to see whether your site has been indexed by search engines, we recommend using the URL indexability miner, which also shows the reasons why a page is not in the index.

Checking if your page is indexed by search engines 

If you want to see whether your page has been indexed, you can use the search engines themselves and their own tools. Here is how to do it:

Check Seznam.cz indexing

To see whether a page has been indexed by Seznam.cz, use the info: operator followed by the URL of the page you want to check.

For example, info:https://www.marketingminer.com/en
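If you check several pages by hand, the query string can be generated rather than typed. Below is a small Python sketch; the helper names are hypothetical, and the search address `https://search.seznam.cz/` with a `q` parameter is an assumption about how Seznam.cz accepts queries, so verify it in your browser before relying on it.

```python
from urllib.parse import quote_plus


def info_query(page_url):
    """Build the text to type into the search box, e.g. 'info:https://...'."""
    return f"info:{page_url}"


def seznam_search_url(page_url, base="https://search.seznam.cz/"):
    """Build a search link for the info: query.

    `base` and the `q` parameter are assumptions, not documented values.
    """
    return f"{base}?q={quote_plus(info_query(page_url))}"


if __name__ == "__main__":
    page = "https://www.marketingminer.com/en"
    print(info_query(page))
    print(seznam_search_url(page))
```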

Can you see your URL in the search results? Then everything is fine: your page is indexed and you don’t have to do anything.

If your page doesn’t appear in the search results, then it’s not indexed and Seznam.cz will be unable to display it. 

Do you want it to be indexed? Here is how to change it:

Go to the Seznam webmaster tool, add the page’s URL and click the “Přidat” (“Add”) button to submit it for indexing.

Add website to Seznam

If SeznamBot finds your page useful, it will crawl and index it. As a result, users will be able to see your page in the search results.

Check Google indexing

Previously, the info: operator on Google gave users information about a website along with additional links related to it. It was the fastest way to see whether a page had been indexed. However, that whole section of the search results was removed in March 2019, and Google no longer supports this search operator. For this reason, Google search operators are no longer a fast or fully accurate way of checking whether your pages are indexed.

Checking indexed pages in Google Search Console 

If you want to check whether specific pages are indexed, use the URL Inspection tool, a diagnostic tool offered by Google Search Console.

To see a page’s current index status, enter the URL you want to inspect into the search bar. Note that before using GSC, you need to verify ownership of your site. After that, you will be able to see detailed information about any indexing issues:

URL Inspection tool Google Search Console

Just a reminder: In Google Search Console, there is a daily limit of inspection requests for each website you have access to. 

So what if you need to check thousands of pages to see if they are all indexed? This is when you use Marketing Miner’s Fulltext Index Checker.

How to check your index status in bulk

After logging into your Marketing Miner account, click the “Create Report” button in the top right corner. Because you want to check the index status of your pages, select URL as the input.

Then enter the list of all URLs whose index status you want to check.

Bulk Index checker insert data
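Before pasting a long list, it can help to clean it up first. The Python sketch below is an illustration under assumptions: the function name `prepare_url_list` is hypothetical, and it assumes the list holds one URL per line. It drops blank lines, entries that are not http(s) URLs, and exact duplicates, while keeping the original order.

```python
from urllib.parse import urlsplit


def prepare_url_list(raw_text):
    """Return unique, well-formed http(s) URLs from text with one URL per line."""
    seen = set()
    urls = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines

        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not an absolute http(s) URL

        if url not in seen:  # keep only the first occurrence
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Made-up example input.
    raw = "https://example.com/a\n\n https://example.com/a \nnot a url\nhttps://example.com/b\n"
    print("\n".join(prepare_url_list(raw)))
```

Duplicates are matched as exact strings only, so `https://example.com/a` and `https://example.com/a/` are treated as two different URLs.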

Then click on the flag to select the country for which you want to get the data and finish by clicking on Next Step.

Selection of the Miner and data collection

In the Miner selection section, click on Miner Fulltext Index Checker. This Miner queries search engines about the given URLs using the info: search operator. In this way, it checks whether each URL is indexed and whether the search engine returns the same URL as the one provided in the input.

Update: Google no longer supports the info: operator, so we developed our own way of checking whether a URL is indexed!

Fulltext Index Checker miner

Nothing else needs to be set. Click on Get Data to start processing your inputs. Once the report is complete, the processed data will be sent to you by email.

Output example

Column description

  • Input: The URLs whose indexing was checked.
  • Indexed by Google: Whether the URL is indexed by the search engine. It returns either “yes” (indexed) or “canonicalized or not indexed” (because Google no longer supports the info: operator, we can’t be sure whether the URL is not indexed or just canonicalized to another URL).
  • URL in results: The URL that the search engine returned for the URL queried with the info: operator.
  • Same as input: Whether the URL at the output is the same as the one at the input. It can also help to identify canonicalization.

Output analysis

Check of non-indexed pages

In the output, you should be primarily interested in the column Indexed by Google, which indicates whether the given URL is indexed by the given search engine (TRUE/FALSE indication). The correct procedure is to filter out the list of non-indexed pages and use it to find out why they aren’t in the search engine index and how to fix the situation.
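For large reports, this filtering can be scripted. The following Python sketch is only an illustration: it assumes the report has been saved as CSV text with the column names listed above, and the set of values treated as "indexed" (`true`, `yes`) is an assumption that you should adjust to what your export actually contains. The function name `non_indexed_urls` is hypothetical.

```python
import csv
import io

# Values of "Indexed by Google" treated as indexed (an assumption).
INDEXED_VALUES = {"true", "yes"}


def non_indexed_urls(report_csv):
    """Return the Input URLs whose 'Indexed by Google' value is not an indexed one."""
    reader = csv.DictReader(io.StringIO(report_csv))
    return [
        row["Input"]
        for row in reader
        if row["Indexed by Google"].strip().lower() not in INDEXED_VALUES
    ]


if __name__ == "__main__":
    # Made-up sample report, not real output.
    sample = (
        "Input,Indexed by Google,URL in results,Same as input\n"
        "https://example.com/a,TRUE,https://example.com/a,TRUE\n"
        "https://example.com/b,FALSE,,FALSE\n"
    )
    print(non_indexed_urls(sample))
```

The resulting list is the starting point for investigating each URL, for example its status code and robots.txt rules.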

Canonicalization check

In some cases, the URL returned at the output by the info: operator differs from the one at the input. This is a sign that the search engine knows about the given URL but uses its canonical URL in search results. To detect these URLs, see the column Same as input, which returns TRUE if the URL at the output is the same as the one at the input, or FALSE if it is not.
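The canonicalized URLs can be pulled out of the report in the same way. This is a rough Python sketch under the same assumptions as any script working on the export: the report is available as CSV text with the column names described above, and Same as input holds TRUE or FALSE. The function name `canonicalized_urls` is hypothetical.

```python
import csv
import io


def canonicalized_urls(report_csv):
    """Return (input URL, returned URL) pairs where the engine returned a different URL."""
    pairs = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        returned = row["URL in results"].strip()
        same = row["Same as input"].strip().upper() == "TRUE"
        # A non-empty result that differs from the input suggests canonicalization.
        if returned and not same:
            pairs.append((row["Input"], returned))
    return pairs


if __name__ == "__main__":
    # Made-up sample report, not real output.
    sample = (
        "Input,Indexed by Google,URL in results,Same as input\n"
        "https://example.com/a,TRUE,https://example.com/a,TRUE\n"
        "https://example.com/c?ref=1,TRUE,https://example.com/c,FALSE\n"
        "https://example.com/b,FALSE,,FALSE\n"
    )
    for source, canonical in canonicalized_urls(sample):
        print(source, "->", canonical)
```

Rows with an empty URL in results are skipped, because without a returned URL there is nothing to compare against.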
