February 02, 2011

Google, Bing, and "Darth Toolbar"

Blog Update (February 3, 2011): Bing Stealing Google Results? Or Users Giving Them Away? Does the Difference Matter?

Greetings. Recently, in My Take on "Google Accuses Bing of 'Stealing' Google Search Results", I suggested that the escalating argument between Google and Microsoft fundamentally relates to the broadening scope of data collected by common "toolbars" -- which are either pre-installed on various systems or installed by users (sometimes voluntarily, sometimes inadvertently as part of other software installations).

Toolbars can be extremely valuable tools for users, but in some key ways various commonly used toolbars have become a case study of "mission creep" -- and along the way have crawled further and further towards the dark side.

Early search engine toolbars were generally focused on enhancing the direct interactions between users and the search service that provided the specific toolbar in question. But over time, toolbar capabilities have in some cases expanded to include collecting data on users' interactions with other sites, both in terms of which sites users visit, and sometimes even the input that users may enter on those other sites.

It can be argued -- and it is true -- that such data (for example, Web browsing history) could be employed by any given search engine to enhance the user experience in various ways. But it's also true that these sorts of data can be extremely valuable signals for an array of competitive purposes.

Now to be clear, not all toolbars engage in these practices, nor do they all default to the same level of information gathering. Legitimate search toolbars are typically very careful to provide user controls over these functions, installation and usage disclosures, and so on. Nor do all search engines use the data collected from toolbars in the same ways. Signals that one firm uses as major input to its search results algorithms may have little or no significance for the search results generated by another service.

In practice, of course, most people don't read the disclosures, and might not understand the full significance of what is being disclosed even if they did bother to read them. And most users will tend to stick with default settings.

It was the shift toward toolbars collecting data on user behaviors beyond the confines of their interactions with the associated specific toolbar providers that has led to the current Google vs. Bing accusations.

Search terms being entered by users on Google were (and apparently are) reportedly being collected and distributed to Bing by Microsoft-provided toolbar mechanisms in certain configurations. Microsoft asserts that participation in this is completely voluntary and disclosed to users -- and technically that appears to be true.
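To make the mechanics concrete, here is a minimal sketch of how any browser add-on that sees the URLs a user visits could recover a search term from a results-page address. This is purely illustrative and is not a description of how any actual toolbar works. The function name extract_search_query, the example.com addresses, and the assumption that the term travels in a "q" query-string parameter are all hypothetical choices made for this example.

```python
from urllib.parse import urlparse, parse_qs


def extract_search_query(url):
    """Return the search term carried in a results-page URL, or None.

    Assumption (hypothetical): the term is carried in a "q"
    query-string parameter of the visited URL.
    """
    parsed = urlparse(url)
    # parse_qs decodes percent-escapes and turns "+" into spaces.
    values = parse_qs(parsed.query).get("q")
    return values[0] if values else None


if __name__ == "__main__":
    print(extract_search_query("https://www.example.com/search?q=darth+toolbar&hl=en"))
    print(extract_search_query("https://www.example.com/about"))
```

The point is simply that once browsing history is being collected, the search terms entered on other sites tend to come along with it, since they are often embedded in the addresses themselves.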

But let's face it. Users are pushed pretty hard to accept toolbar installations along with other packages, and as I mentioned above, most people aren't particularly interested in plowing through the disclosures and option settings. So it's likely that many users of these toolbars were unaware in practice that their Google usage data was being uploaded to Microsoft.

In a way, this is somewhat similar to what would happen if you automatically created a list of your Google (or Bing, or whatever) searches every day and posted it to a public Web page, where it could be discovered and indexed by any search engine that happened along.

[Update (6:00 PM): To be more precise, a better analogy is posting your queries along with the resulting links. But in the context of this discussion we're still talking primarily about a "discovery" mechanism for new search terms plus those related links. This appears to be significantly different from what would be the case if, for example, a firm were routinely scanning and copying a competitor's results en masse for already known terms in an effort to improve its own existing results for those terms.]

Search engines would likely use any unusual or otherwise unknown terms and links in those searches (as present on this hypothetical user's "searches" Web page) for additional page and link discovery input, which then could (depending on the specific algorithms in use) find their way into those services' main search databases, where other users could find them. As you can see, the end result is quite similar to what's reportedly occurred in the current Google/Bing controversy, though enabled by toolbar activities in that case, not by public posting of searches.
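The "discovery" idea described above can be sketched in a few lines. This is a hedged illustration only, not any search engine's actual algorithm. The function discover_new_pairs, the shape of the observations (term, link pairs), and the known_index dictionary are all hypothetical names and structures invented for this example.

```python
def discover_new_pairs(observations, known_index):
    """Return term -> set of links that are not already known.

    observations: iterable of (term, link) pairs, e.g. gathered from a
                  public "my searches" page (hypothetical input).
    known_index:  dict mapping a term to the set of links already
                  associated with it (hypothetical stand-in for a
                  search service's existing data).
    """
    discovered = {}
    for term, link in observations:
        key = term.strip().lower()  # crude normalization, an assumption
        if link not in known_index.get(key, set()):
            discovered.setdefault(key, set()).add(link)
    return discovered


if __name__ == "__main__":
    known = {"toolbar": {"http://a.example/"}}
    seen = [
        ("Toolbar", "http://a.example/"),  # already known, ignored
        ("xqzzyv", "http://b.example/"),   # unusual term, new link
    ]
    print(discover_new_pairs(seen, known))
```

Note that only the unfamiliar term and its link survive the filter. That is the distinction drawn in the update above: this is discovery of new terms and links, not wholesale copying of results for terms already known.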

So where does all this leave us with respect to the current saga?

I don't quote biblical text very often (to say the least!) but Matthew 26:52 seems somewhat appropriate in this case: "Put your sword in its place, for all who take the sword will perish by the sword."

Personally, I don't like what Microsoft is doing by collecting Google search inputs and using them as signal data for Bing's search results algorithms. It strikes me as underhanded and unethical, and I doubt very much that most users are cognizant that it is occurring. I'll bet that at least some users would definitely be uncomfortable with such activities. I'd like to see such behavior by Bing stopped.

But ultimately, the entire search industry has been extending the bounds of toolbar data collection for years now, and it was probably inevitable that one or more players would push the envelope into the rather nasty area where Bing now resides in this respect. It has been tolerance of this gradual creeping toward the dark side of data collection by toolbars that set the stage for where we are now, and there's blame to go around for all of us on that score.

Toolbars can be extraordinarily useful, or they can be abused. Or both. And as with any tool, even when such abuse may not actually be illegal, it can still be ethically bankrupt.

Microsoft should cease "poaching" Google search queries from users, on ethical grounds if nothing else.

But more broadly, we should all be giving some deep thought to our roles in allowing some toolbars to evolve toward becoming the hungry data Morlocks of Web technology, rather than remaining the strictly useful tools -- with clear and reasonable demarcations of data collection and use -- that most of them started out to be in the first place.


Posted by Lauren at February 2, 2011 12:25 PM | Permalink
Twitter: @laurenweinstein
Google+: Lauren Weinstein