August 22, 2006

Search Engine Privacy Dilemmas -- and Paths Toward Solutions

Greetings. A very recent New York Times story neatly encapsulates the overall state of search engine query data retention issues.

The observant reader will note that despite the rising tide of concerns regarding search query privacy, the industry as a whole is still pretty much in a state of denial, made all the more confusing by various signals from the U.S. Department of Justice.

This is turning into such a mess that it's becoming difficult to even keep the various participants and their positions completely clear. There is every reason to believe that without heroic action by the players involved, we may be heading toward a privacy, legislative, and judicial nightmare. But maybe there's a way out.

Let's review:

AOL's release of search query data made obvious to everyone what many of us knew all along -- that such data contains all manner of personal information, even when the identity of the party making the query is not immediately known directly from usage logs. In the AOL case, the individual query entries were linked by "anonymized" user IDs, but even without such linkages the query items alone can be highly privacy-invasive. The AOL release triggered (as did DOJ vs. Google) broad calls for mandated search query data destruction policies.

The personal nature of the AOL query data serves nicely to liquidate the DOJ's arguments (again, as in DOJ vs. Google) that such data is not privacy-invasive so long as the query source is unidentified. The expressed DOJ reasoning in this regard is obviously faulty.

Search engine companies have been reluctant to voluntarily dispose of query data on a regular basis. This data has considerable R&D, marketing, and other value. Since the incremental cost of keeping all queries archived forever is so low, there is little incentive within the normal business structure to dispose of this resource, absent overriding considerations.

Even while laudably expressing concerns about the potential for third-party misuse of query data, search engine firms (e.g. Google) have proclaimed their intention to keep collecting and saving this data indefinitely. If AOL actually sets in place an aggressive data destruction schedule, it will be something of a watershed event that may (or may not) have broad impacts across the search engine industry. Fears of being placed at a competitive disadvantage will tend to make unilateral moves toward query data destruction difficult to propose or implement.

Meanwhile, DOJ is moving in exactly the opposite direction, apparently preparing to propose long-term (perhaps measured in years) mandated data retention schedules, requiring the saving of the very data for which destruction demands are being made in other quarters. DOJ is using child abuse (and, of late, anti-terrorism efforts) as its hooks to justify such legislation (please see this entry for more).

This situation has all the elements of a painful and wasteful deadlock, potentially triggering years of litigation while the overall search engine issues continue to fester and become even bigger privacy, business, and political problems.

If we wish to avoid this scenario -- or at least have a good shot at avoiding it -- we need to act now, and we need to do so cooperatively. There are policy and technological approaches to the search query dilemma that can be applied in ways that will serve the interests of all stakeholders. Cooperation and compromise mean that nobody is likely to get everything that they'd ideally want, but to paraphrase the great philosopher Mick Jagger, perhaps we can all get much of what we need.

Therefore, I propose the formation of a high-level Internet working group/consortium dedicated specifically to the cooperative discussion of these issues and the formulation of possible policy and technology constructs that can be applied toward their amelioration. Such a working group would be as open as possible, though proprietary concerns would likely necessitate some closed aspects if progress is to be made quickly.

Participation by all stakeholders would be invited. Representatives of the major search engine firms and concerned government agencies, outside technologists and other persons involved in privacy and search issues, and other entities as appropriate, would all play important roles.

Of course, it's easy -- especially for large corporate enterprises -- to simply ignore such efforts and just plow ahead independently. Obviously, without the participation of the key players, the effort that I'm proposing would be useless, and I will not continue to promote it if that situation ensues.

However, I suggest that it will be in the long-term best interests, both financially and in terms of corporate and organizational responsibility, for major stakeholders to actively join such a project, since the alternative seems ever more likely to be somewhere between highly disruptive and extremely draconian.

Interested? Please let me know. All responses will be treated as confidential unless the sender indicates otherwise.

Thank you for your consideration.

--Lauren--

Posted by Lauren at August 22, 2006 08:54 AM