May 01, 2007

Benefits and Risks in Google's Public Records Access Project


I'd like to expand a bit on the issue of the shifting legal landscape for search engines. This story discusses how Google is working with states and other governmental entities to improve public records database access. This could provide some real benefits by bringing Google's typically high-quality access to this important and useful data, but it could also easily open a new Pandora's box of major privacy and related risks.

To the extent that such databases become more easily searchable and integrated with Google's core database, there may unfortunately be a qualitative change in the potential for bad actors to take advantage of the system.

For example, right now, a search for "David J. Farber" on Google will of course find lots of material, mostly related to this person's professional writings and activities. Integration into Google of major public records databases means that much more potentially intrusive and abusable information -- real estate records are full of this stuff -- would be just as easily found, either when targeting individuals or searching for broader classes of "targets" based on particular criteria (age, health, address, etc.).

True, the data is coming from state, federal, or local databases which already contain the information, but the very awkwardness of accessing these systems -- in comparison to the ease of using Google -- creates something of a crude firewall against at least some casual and large-scale abuses, without depending solely on the quality of "sensitive information" redactions by the source agencies.

I believe that it would be extremely useful for Google to consider implementing additional privacy protocols aimed specifically at lowering this risk potential with public record data. A hands-off approach (treating all data equally) is unlikely to provide long-term legal protection to Google or other search engines, since it seems increasingly probable that courts will ultimately find that entities that organize data in ways that lead (however unintentionally) to abuses may share responsibility and liability for those abuses along with the data source providers.

I've seen significant public-records data nightmares even with the existing crude database access systems. Individuals and organizations can be and are hurt by abuse of such data. As I noted above, a high-quality Google interface to this data will bring both broad global benefits and a wide range of serious new risks on a much larger scale, risks that could have major public policy and privacy ramifications in a number of key areas.

It appears inevitable that courts and Congress will at some point start clamping down on "enabling technologies" -- like search engines -- which judges and legislators will view as being directly involved in copyright and other data-access-related abuses, since these services act as "middlemen" by organizing the data in ways that so vastly increase the ease of access.

I doubt that the sort of "safe harbor" provisions found in the DMCA today will last indefinitely without significant modifications that would likely mean new liabilities for Google and others, unless these firms take proactive steps to try to balance out some of these benefit/risk factors. I do believe that such proactive steps are technically and reasonably possible.

Failing to take such corrective actions risks draconian legislation and court decisions that could have drastic negative impacts on search engines' income and operations, and dramatically reduce the usefulness of these services to the world's users.


Posted by Lauren at May 1, 2007 10:51 AM