October 22, 2011

Google Modifies SSL Behavior -- and the Results Are Troubling

A few days ago in another venue, I noted that Google has moved toward default SSL search for its logged-in users. I have long been an advocate for broader use of SSL encryption, as a means to protect users' data from third-party observation and manipulation, so I applauded this move by Google.

In that posting, I noted that there were some issues related to Web "referers" (that's the correct spelling in this context!) when used in SSL environments.

This isn't the time or place to get into a philosophical discussion of referers, or of whether passing along the data they contain (such as user search queries) should ever have become standard operating procedure on the Web. As I've discussed previously, I am not in the "referers are always evil" camp, and I view the issues surrounding referers as complex.

I've now had time to further study the referer/SSL situation with Google's new default SSL environment -- and to discuss the situation with Google -- and I'm frankly troubled by some of what I've found. In particular, certain changes that Google has apparently made in the normally expected SSL handling of referers may be viewed by some observers as at least creating the appearance of a "pay to play" push toward Google advertising.

This gets rather complicated quickly, and I'm basing the description that follows on the best information I have at this point. If I later learn more of note, I will update as appropriate.

Please refer to my chart as we proceed (chart is also below).

One of the basic rules of SSL is that data is supposed to be protected end to end. In particular, when an SSL-based query is made of a search engine, the expected behavior is that referer query data will not be passed along to sites clicked in the results unless those sites are also running SSL (subject to default browser and plugin settings, which would normally pass this data through).
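The expected behavior just described can be sketched in a few lines of Python. This is only an illustration of the conventional browser rule (the HTTP/1.1 specification says clients should not send a Referer header in a non-secure request when the referring page was fetched securely), not a description of any particular browser's code. The function name and the example.com destinations are made up for the example.

```python
from urllib.parse import urlsplit


def should_send_referer(source_url: str, dest_url: str) -> bool:
    """Return True if a browser following the conventional rule would
    send a Referer header when the user clicks from source_url to dest_url.

    Assumption: default browser settings, no plugins altering the behavior.
    """
    source_secure = urlsplit(source_url).scheme == "https"
    dest_secure = urlsplit(dest_url).scheme == "https"
    # A page fetched over SSL hands its URL (including any search query
    # in it) only to another SSL destination.
    if source_secure and not dest_secure:
        return False
    return True


search_page = "https://encrypted.google.com/search?q=example"
print(should_send_referer(search_page, "https://example.com/"))  # SSL to SSL
print(should_send_referer(search_page, "http://example.com/"))   # SSL to non-SSL
```

Under this rule, only the first click would carry the search query along in the referer.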

This is indeed the reported behavior on Google's original encrypted search site -- branded as "SSL Beta" -- at https://encrypted.google.com. Users have been able to manually use this site for quite some time, by specifying it as a direct URL (either https:, or http: which diverts to https:). As you can see on the chart, referer query data from https://encrypted.google.com is passed to both "Ordinary" and "Ad" sites if those sites are https: (SSL), otherwise the referer query info is blocked.

However, Google has apparently altered this expected sequence for their new https://www.google.com -- which can be reached either by direct URL, or via automatic redirect from http: as described above for logged-in Google users.

Two deviations from expected SSL behavior are reported on this version of the site, as denoted by the red boxes on the chart.

Normally, we would expect an ordinary destination site using SSL to receive the referer query data as per standard SSL end-to-end behavior. But apparently Google is now blocking this data in this case, as shown in the first red box.

Even more problematically, in the second red box we observe that for user clicks on Google ads, the ad site will receive the referer query info from the SSL search, even if that ad site is not using https: -- that is, isn't even using SSL at all -- seemingly directly violating the normally expected end-to-end SSL protection sequence.
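For readers who prefer to see the comparison laid out as data, here is a rough Python sketch of the behavior as reported above. The table is my reading of the situation in prose form, based on the best information I have, and is not output from any Google system. The two cells not discussed explicitly (an ordinary non-SSL site and an SSL ad site on https://www.google.com) are assumed to follow the expected rule, since only two deviations are reported. All names in the code are invented for illustration.

```python
def expected_pass(dest_is_https: bool) -> bool:
    """Expected rule: an SSL search page passes referer query data
    only to destinations that are themselves running SSL."""
    return dest_is_https


# Reported behavior: (kind of result, destination uses https) -> data passed?
REPORTED = {
    "encrypted.google.com": {
        ("ordinary", True): True,
        ("ordinary", False): False,
        ("ad", True): True,
        ("ad", False): False,
    },
    "www.google.com": {
        ("ordinary", True): False,   # first red box: blocked despite SSL
        ("ordinary", False): False,  # assumed: matches the expected rule
        ("ad", True): True,          # assumed: matches the expected rule
        ("ad", False): True,         # second red box: passed without SSL
    },
}


def deviations(site: str) -> list:
    """List the cases where reported behavior differs from the expected rule."""
    return [
        (kind, https)
        for (kind, https), passed in REPORTED[site].items()
        if passed != expected_pass(https)
    ]


for site in REPORTED:
    print(site, deviations(site))
```

Run as written, this should report no deviations for encrypted.google.com and exactly the two red-box cases for www.google.com.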

Again, I am not anti-referer. I appreciate the value of referer query data for legitimate Search Engine Optimization (SEO), for ordinary Web site operators, and of course for advertisers.

Google notes a number of mitigating factors. They feel that they attempted to strike a balance between security and the needs of advertisers, and point out that only a small percentage (logged-in users, less than 10% of total searches currently) of queries are diverted to https://www.google.com. They also note that some search query data (though clearly not of the same depth as raw referer query data) is available through their (quite excellent, I will note) Webmaster Tools.

Still, these are quantitative, not qualitative, limitations, and use of https://www.google.com can only be expected to expand.

So long as Google stayed within the normal, expected operating parameters of SSL in relation to referer queries, as with https://encrypted.google.com, they were on very firm ground.

But by changing the expected SSL behavior for https://www.google.com, they have created the appearance of a situation where -- as far as referer query data is concerned in this context -- it could be construed as necessary to buy Google ads, either to obtain query data that you otherwise wouldn't be able to obtain, or to keep receiving data that you ordinarily would have expected to receive anyway.

My recommendation is fairly simple. In the optimum case, https://www.google.com should behave in the same manner that https://encrypted.google.com does now. As an alternative to ease the situation for advertisers, ad sites could keep receiving query data from https://www.google.com -- even over http: links -- for a limited period, to provide transition time for them to move to full https: SSL. This would also serve the laudable goal of further encouraging the adoption of SSL. At the end of the designated transition period, if ad sites were still not on https:, they would no longer receive referer query data. And needless to say, any site that runs https: -- whether they buy ads or not -- should be able to receive referer query data if ad sites can receive that data.
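To make the transition alternative concrete, here is a small Python sketch of the policy I'm suggesting. It is purely illustrative: the function name is invented, and the cutoff date is a placeholder, since I'm not proposing any specific length for the transition period.

```python
from datetime import date


def recommended_pass(dest_is_https: bool, is_ad: bool,
                     today: date, transition_end: date) -> bool:
    """Should referer query data be passed under the suggested policy?

    - Any https: destination receives it, whether or not it buys ads.
    - A non-SSL ad destination keeps receiving it only until the
      transition period ends.
    - A non-SSL ordinary destination never receives it.
    """
    if dest_is_https:
        return True
    return is_ad and today <= transition_end


# Hypothetical cutoff, chosen only for the example.
cutoff = date(2012, 6, 30)

print(recommended_pass(False, True, date(2012, 1, 15), cutoff))   # ad, http:, during transition
print(recommended_pass(False, True, date(2012, 7, 1), cutoff))    # ad, http:, after transition
print(recommended_pass(True, False, date(2012, 7, 1), cutoff))    # ordinary, https:
```

Setting the cutoff in the past gives the optimum case, where https://www.google.com simply behaves the way https://encrypted.google.com does now.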

Google is clearly aware of the controversy surrounding their SSL situation. And one of Google's great strengths is their robust internal deliberation process and willingness to change course as appropriate.

It will be interesting to see how this saga unfolds.

--Lauren--


Posted by Lauren at October 22, 2011 08:06 PM
Twitter: @laurenweinstein
Google+: Lauren Weinstein