Googlebot will soon speak HTTP/2

Quick summary: Starting in November 2020, Googlebot will crawl some sites over HTTP/2.

Ever since mainstream browsers started supporting the next major revision of HTTP, HTTP/2 or h2 for short, web professionals have asked us whether Googlebot can crawl over the upgraded, more modern version of the protocol.

Today we’re announcing that starting mid-November 2020, Googlebot will support crawling over HTTP/2 for select sites.

What is HTTP/2

As we said, it’s the next major version of HTTP, the protocol the internet primarily uses for transferring data. HTTP/2 is more robust, more efficient, and faster than its predecessor, thanks to its architecture and the features it implements for clients (for example, your browser) and servers. If you want to read more about it, we have a long article on the HTTP/2 topic on developers.google.com.

Why we’re making this change

In general, we expect this change to make crawling more efficient in terms of server resource usage. With h2, Googlebot is able to open a single TCP connection to the server and efficiently transfer multiple files over it in parallel, instead of requiring multiple connections. The fewer connections open, the fewer resources the server and Googlebot have to spend on crawling.
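As a rough illustration of what "multiple files over one connection" means, here is a toy Python model of multiplexing. It is not a real HTTP/2 client: the `multiplex` function, the paths, and the chunk names are all made up for this sketch, and the simple round-robin ordering is an assumption (real scheduling depends on prioritization and flow control). The one detail taken from the protocol itself is that client-initiated streams use odd identifiers.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple


def multiplex(responses: Dict[str, List[str]]) -> List[Tuple[int, str]]:
    """Interleave chunks of several responses onto one simulated connection.

    Returns a list of (stream_id, chunk) pairs in the order they would be
    sent. Round-robin ordering is an assumption made for this toy model.
    """
    # Client-initiated HTTP/2 streams use odd identifiers: 1, 3, 5, ...
    streams = {2 * i + 1: chunks for i, chunks in enumerate(responses.values())}

    frames = []
    # Take one chunk from every stream per round; shorter streams simply
    # drop out once they run out of chunks.
    for round_of_chunks in zip_longest(*streams.values()):
        for stream_id, chunk in zip(streams, round_of_chunks):
            if chunk is not None:
                frames.append((stream_id, chunk))
    return frames


frames = multiplex({
    "/index.html": ["html-1", "html-2"],
    "/style.css": ["css-1"],
    "/app.js": ["js-1", "js-2", "js-3"],
})
for stream_id, chunk in frames:
    print(stream_id, chunk)
```

All three responses share the same simulated connection, and each chunk carries its stream identifier so the receiver can reassemble the files. With HTTP/1.1, fetching the same three files in parallel would typically need three separate connections.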

How it works

In the first phase, we’ll crawl a small number of sites over h2, and we’ll ramp up gradually to more sites that may benefit from the initially supported features, like request multiplexing.

Googlebot decides which sites to crawl over h2 based on whether the site supports h2, and whether the site and Googlebot would benefit from crawling over HTTP/2. If your server supports h2 and Googlebot already crawls a lot from your site, you may already be eligible for the connection upgrade, and you don’t have to do anything.
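If you are not sure whether your server supports h2, one way to check is to see which protocol it picks during the TLS handshake via ALPN, where "h2" is the identifier for HTTP/2 over TLS. The sketch below uses only Python's standard `ssl` and `socket` modules; the function names are ours, chosen for illustration. It only tells you what your server offers, not whether Googlebot will actually crawl it over h2.

```python
import socket
import ssl
from typing import Optional


def alpn_context() -> ssl.SSLContext:
    """Build a TLS context that offers h2 first, then HTTP/1.1."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host: str, port: int = 443,
                        timeout: float = 5.0) -> Optional[str]:
    """Connect to host:port and return the ALPN protocol the server selected.

    Returns None if the server did not select any offered protocol.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with alpn_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def speaks_h2(selected: Optional[str]) -> bool:
    """True if the negotiated ALPN protocol is HTTP/2 over TLS."""
    return selected == "h2"
```

For example, `speaks_h2(negotiated_protocol("example.com"))` performs a real handshake against that host (replace it with your own) and reports whether the server agreed to h2.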

If your server still only talks HTTP/1.1, that’s also fine. There’s no explicit drawback to crawling over this protocol; crawling will remain the same in both quality and quantity.

How to opt out

Our preliminary tests showed no issues or negative impact on indexing, but we understand that, for various reasons, you may want to opt your site out of crawling over HTTP/2. You can do that by instructing the server to respond with a 421 HTTP status code when Googlebot attempts to crawl your site over h2. If that’s not feasible at the moment, you can send a message to the Googlebot team (however, this solution is temporary).
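The opt-out rule can be sketched as a small decision function. This is a minimal illustration, not a drop-in server configuration: `opt_out_status` is a hypothetical helper, the way your server exposes the protocol version and user agent will differ, and matching on the user agent string alone is an assumption made for brevity (user agents can be spoofed). 421 is the "Misdirected Request" status code.

```python
def opt_out_status(http_version: str, user_agent: str,
                   normal_status: int = 200) -> int:
    """Return 421 for Googlebot requests made over HTTP/2.

    Any other request keeps the status it would normally get.
    Hypothetical helper: a real server would apply this rule in its own
    configuration or middleware layer.
    """
    is_h2 = http_version.upper() in ("HTTP/2", "HTTP/2.0")
    # Assumption: Googlebot is identified by its user agent token only.
    is_googlebot = "googlebot" in user_agent.lower()
    return 421 if (is_h2 and is_googlebot) else normal_status


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(opt_out_status("HTTP/2", GOOGLEBOT_UA))    # Googlebot over h2
print(opt_out_status("HTTP/1.1", GOOGLEBOT_UA))  # Googlebot over HTTP/1.1
print(opt_out_status("HTTP/2", "Mozilla/5.0"))   # another client over h2
```

Only the first call returns 421. Googlebot over HTTP/1.1 and other clients over h2 are served as usual, so the opt-out affects the protocol Googlebot uses and not whether the site is crawled.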

If you have more questions about Googlebot and HTTP/2, check the questions we thought you might ask. If you can’t find your question, write to us on Twitter or in the help forums.

Posted by Jin Liang and Gary
