April 07, 2009

Privacy vs. Crawling: Digging Deeper into "Newspapers vs. Google"

Greetings. Lots of strong reactions arrived in response to yesterday's "AP Declares War on Google and Others, But the Collateral Damage Will Be Ours."

These covered quite a range, but among the most interesting were:

-- The assertion that Google and Yahoo ignore "robots.txt" (this is clearly an inaccurate claim when robots.txt is used properly)

-- Several declarations amounting to "It's a violation of a site's privacy if a search engine crawls it without explicit permission" (my response: such material should not be exposed in non-password-protected areas of publicly accessible Web sites in the first place)

and finally a whole bunch of:

-- Please explain more about the battles between newspapers and Google -- it's still too confusing.

I feel your pain. When a system is under stress the way that all manner of media and content are today in the Internet world, simple explanations can only take us so far.

Here's one example. We know (see yesterday's piece for details) that there's a great deal of animosity being directed toward news aggregators (like Google News) by newspapers, media moguls, and other organizations.

But the battle lines aren't always clear. Google News not only indexes newspaper sites and the AP articles on those sites, but also displays (for limited periods of time, as far as I know) the full text of many AP items. Does Google do the latter without explicit permission from AP?

Of course not. In fact, the agreement that permits this apparently started back in 2006.

So what's all the yelling about? There are a couple of factors in play. One is that, except for locally generated content, Internet access to newspaper sites renders much of the material on these sites largely fungible -- that is, interchangeable. A national or international AP item on newspaper A's site will ordinarily be exactly the same as, or very similar to, that same item over at newspaper B's site -- both of which are equally accessible over the Net.

This (among other issues, such as wire service subscription costs) has triggered increasing friction between wire services such as Associated Press and their client newspapers. And from a local newspaper's standpoint, an AP article that can be viewed standalone on Google News is one that a potential reader no longer needs to view at a newspaper site, where they'd be exposed to other materials that the site offers.

Since AP itself makes this scenario possible, something of a love/hate relationship becomes inevitable, and the result is not only considerable confusion, but a lot of angry finger pointing as well.

I suspect we can all agree that if a Web site displays substantial portions or complete versions of copyrighted materials from newspapers (whether wire service related or not) without explicit permission, it's likely a copyright violation.

But attempting to marginalize the validity of "fair use" -- as some participants in this debate are apparently doing -- is also wrong. Likewise wrong-headed are various other arguments being made against the propriety of search engine indexing, certain forms of "deep linking," and assorted other Internet-specific methodologies.

History tells us that these sorts of technology-based disruptions are nothing new, and in fact are to be expected. What's more, the initial, knee-jerk reactions to these situations are often proven to be highly counterproductive and in retrospect sometimes just plain silly.

Television, we were told, would destroy both radio and the motion picture industry. Free commercial television broadcasting, the warnings came, would be rendered a mere memory by the rise of "Pay TV" services (I still remember anti-pay-TV ads with a drawing of an old TV set attached to big hoses, leading to a coin collection box). Now we're blasted with "The Internet is destroying the newspaper industry."

To be sure, none of these media forms have stayed unchanged in the face of technological advancement and competitive alterations, but they have all survived, largely because they've continued to bring value to their readers, listeners, and viewers -- a structural dynamic that can persist so long as we will it -- with our eyes, ears, and wallets -- to do so.

It would be exceedingly helpful if, instead of issuing "mad as hell" pronouncements, the newspaper industry worked with the rest of us toward a mutually beneficial future where access to information would continue to expand, rather than be artificially constrained on the heels of protectionist rhetoric.

--Lauren--

Posted by Lauren at April 7, 2009 12:21 PM
Twitter: @laurenweinstein
Google+: Lauren Weinstein