GoogleBot

Last Edit: 10/01/17

GoogleBot, also referred to as the Google robot, is the name Google gives to its web crawler. A web crawler is a software program, classified as a 'robot', that automates the process of collecting data about web content. Google uses GoogleBot to collect the information used in its search engine. GoogleBot begins with a core set of URLs and then follows each hyperlink that it finds in a webpage; over time, it can crawl billions of webpages and build an extensive database of information that visitors to google.com can query.
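The seed-and-follow process described above can be sketched as a breadth-first traversal. This is only an illustration, not Google's implementation: the `LINK_GRAPH` dictionary, the example.com URLs, and the `crawl` function are all hypothetical, and the in-memory dictionary stands in for actually fetching pages.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page URL -> URLs it links to.
LINK_GRAPH = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}


def crawl(seed_urls, get_links):
    """Visit every URL reachable from the seeds exactly once, breadth-first."""
    queue = deque(dict.fromkeys(seed_urls))  # de-duplicated seeds, order kept
    seen = set(queue)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:  # skip pages that are already queued or visited
                seen.add(link)
                queue.append(link)
    return visited


order = crawl(["http://example.com/"], lambda url: LINK_GRAPH.get(url, []))
print(order)
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other, as the first and third pages do here.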

In 2004, GoogleBot was crawling in the region of three billion webpages; its search results (database) were generally updated on a monthly basis. By 2014, however, Google was crawling far more webpages and updating its search results in a more fluid manner. Webmasters can identify GoogleBot by checking their access logs for its user-agent.
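As a rough sketch of that log check, the snippet below scans access-log lines for a Googlebot token and extracts the version number. The sample log lines, the IP addresses (taken from documentation ranges), and the `googlebot_version` helper are made up for illustration. The user-agent shown is the commonly published Googlebot 2.1 string and is not taken from this article. A user-agent can be spoofed, so a match should be treated as a hint rather than proof.

```python
import re

# Matches the crawler token and captures its version, e.g. "Googlebot/2.1" -> "2.1".
GOOGLEBOT_RE = re.compile(r"Googlebot/(\d+(?:\.\d+)*)")

# Made-up access-log lines in combined log format (illustrative only).
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Jan/2014:06:25:24 +0000] "GET / HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/Jan/2014:06:26:01 +0000] "GET /about HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (Windows NT 6.1; rv:26.0) Gecko/20100101 Firefox/26.0"',
]


def googlebot_version(log_line):
    """Return the Googlebot version named in a log line, or None if absent."""
    match = GOOGLEBOT_RE.search(log_line)
    return match.group(1) if match else None


for line in SAMPLE_LOG:
    print(googlebot_version(line))
```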

As of 2014, the latest version of GoogleBot is version 2.1; the current version is always given in its user-agent. GoogleBot can be blocked from crawling a website: the webmaster creates a robots.txt file and writes a 'Disallow' rule for the GoogleBot user agent. The IP address used by GoogleBot is changed periodically; Google states that it changes the IP address to combat spammers.
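A minimal sketch of such a rule, checked with Python's standard `urllib.robotparser` module, is shown below. The robots.txt content and the example.com URL are illustrative assumptions, and `SomeOtherBot` is a hypothetical crawler name used only for contrast.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block Googlebot from the whole site, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse in memory instead of fetching a URL

googlebot_allowed = parser.can_fetch("Googlebot", "http://example.com/page.html")
other_allowed = parser.can_fetch("SomeOtherBot", "http://example.com/page.html")

print(googlebot_allowed, other_allowed)
```

In a real deployment the file is served from the root of the site as /robots.txt. The rule only takes effect because the crawler chooses to honor it.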

GoogleBot generally follows SRC and HREF hyperlinks, although it is becoming increasingly sophisticated and can crawl content generated by AJAX and JavaScript. If a webmaster wants GoogleBot to crawl their content, they need either to get a hyperlink to it embedded in a webpage that GoogleBot already crawls, or to submit a request to Google via its submit URL form or Webmaster Tools.
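To make the SRC and HREF point concrete, here is a small sketch of how a crawler might collect those attributes from a page using Python's standard `html.parser`. The `LinkExtractor` class, the sample HTML, and the base URL are hypothetical. The sketch only reads static markup, so it would not see links generated by AJAX or JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from every href and src attribute in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


# Illustrative page: one href, one src, and one anchor with neither attribute.
SAMPLE_HTML = '<a href="/about.html">About</a><img src="logo.png"><a name="top">Top</a>'

extractor = LinkExtractor("http://example.com/index.html")
extractor.feed(SAMPLE_HTML)
links = extractor.links
print(links)
```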