What is a crawl budget

Last modified: 13.10.2020

A crawl budget is the number of URLs that a crawler can crawl within a given period of time. It is most commonly expressed as the number of URLs that a specific robot (crawler) requests in one day. It depends on several factors; the most important are content quality, website load speed and internal linking structure.
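
As a rough illustration of "URLs per crawler per day", the sketch below counts the distinct URLs one crawler requested on each day. The function name `daily_crawl_counts`, the `(date, user_agent, url)` record layout and the sample records are assumptions made for this example, and counting distinct URLs rather than all requests is one possible reading of the definition.

```python
from collections import defaultdict

def daily_crawl_counts(records, crawler_token):
    """Count the distinct URLs requested per day by one crawler.

    `records` is an iterable of (date, user_agent, url) tuples
    (a hypothetical layout). The crawler is matched by a
    case-insensitive substring of the user-agent string.
    """
    urls_by_day = defaultdict(set)
    token = crawler_token.lower()
    for date, user_agent, url in records:
        if token in user_agent.lower():
            urls_by_day[date].add(url)
    return {day: len(urls) for day, urls in urls_by_day.items()}

# Made-up sample records, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
records = [
    ("2020-10-12", GOOGLEBOT, "/"),
    ("2020-10-12", GOOGLEBOT, "/blog"),
    ("2020-10-12", BROWSER, "/blog"),  # a regular visitor, not counted
    ("2020-10-13", GOOGLEBOT, "/"),
]
print(daily_crawl_counts(records, "googlebot"))
```

Averaging these daily counts over a longer period gives a more stable estimate than any single day.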

How to identify a web crawl budget

To identify the crawl budget for a specific search engine crawler (robot), you need the server's access log. An access log is a file on the server that records every request the server processes. Each record contains fields such as:

  • User-agent (used to identify requests coming from a crawler)
  • IP address
  • URL of a request
  • Date and time of a request
  • …and many others

This data lets SEO specialists, or anyone else, closely analyze the requests made by search engine crawlers.
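
The fields listed above can be extracted from a raw log line with a short script. The sketch below assumes the log is written in the widely used Combined Log Format. Your server may be configured differently, so the pattern may need adjusting. The names `LOG_PATTERN` and `parse_line` and the sample line are made up for this example.

```python
import re
from datetime import datetime

# Assumed layout (Combined Log Format):
# ip ident user [time] "method url protocol" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

# Made-up sample line, for illustration only.
sample = (
    '66.249.66.1 - - [13/Oct/2020:06:25:14 +0000] "GET /blog/ HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(entry["ip"], entry["url"], entry["status"], entry["time"].date())
```

Keep in mind that the user-agent string is supplied by the client, so a request that claims to come from a crawler is not guaranteed to be genuine.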


Crawl budget optimization

Search engines assign a website its crawl budget primarily on the basis of its authority (link portfolio) and the volume of unique, quality content they are able to retrieve from it. When analyzing crawler activity, crawl waste also has to be considered. Crawl waste consists of crawler requests that go to non-existent pages or to pages we don't want crawled. The most common problems found in log analysis are:

  • URLs with an error response
  • Non-indexable pages
  • Pages with “thin content”
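
To get a feel for how much of a crawler's activity falls into these categories, the requests can be tallied as in the sketch below. It is a minimal illustration under several assumptions: `crawl_waste_report` is a hypothetical helper, an error response is taken to mean any status code of 400 or higher, and the sets of non-indexable and thin-content URLs have to come from a separate site audit because the access log does not contain that information.

```python
def crawl_waste_report(requests, non_indexable=frozenset(), thin_content=frozenset()):
    """Tally crawler requests by crawl-waste category.

    `requests` is a list of (url, status) pairs for a single crawler.
    `non_indexable` and `thin_content` are sets of URLs taken from a
    separate audit (assumed inputs, not derived from the log itself).
    """
    counts = {"error_response": 0, "non_indexable": 0, "thin_content": 0, "ok": 0}
    for url, status in requests:
        if status >= 400:  # assumption: 4xx and 5xx count as errors
            counts["error_response"] += 1
        elif url in non_indexable:
            counts["non_indexable"] += 1
        elif url in thin_content:
            counts["thin_content"] += 1
        else:
            counts["ok"] += 1
    total = len(requests)
    wasted = total - counts["ok"]
    counts["waste_share"] = wasted / total if total else 0.0
    return counts

# Made-up sample data, for illustration only.
requests = [
    ("/", 200),
    ("/old-page", 404),
    ("/search?q=x", 200),
    ("/tag/misc", 200),
    ("/blog/", 200),
]
report = crawl_waste_report(
    requests,
    non_indexable={"/search?q=x"},
    thin_content={"/tag/misc"},
)
print(report)
```

A high waste share suggests that part of the crawl budget is being spent on URLs that bring no benefit in search.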