Greetings. Word is that the U.S. FTC (Federal Trade Commission) is seriously toying with the concept of establishing some sort of Internet "do-not-track" list to ostensibly control Web ads that involve behavioral targeting and/or user tracking.
Outside of the fact that it's not entirely clear that the FTC has this authority per se -- and in the maximal case would only have purview over U.S. sites (the Internet is international, the last time I checked) -- the concept of trying to create such a list strikes me as basically undesirable and impractical, for both policy and technical reasons.
I'll have much more to say about this later, but as a starting point let's consider these issues:
- How effective has the phone solicitation "do-not-call" list really been? In my experience and based on anecdotal evidence -- despite claims from some quarters of vastly reduced solicitations -- the reality is that the calls just keep on comin'. And the phone-based list deals with the comparatively simple target of phone numbers, not the complexities of Web sites.
- How do we actually define "tracking" and "ad targeting" in a rapidly evolving Internet environment? This is actually a very complicated matter.
- Are ads that are generally less targeted and more "scattershot" a net plus or minus for consumers?
- If ads lose significant value, what are the ramifications for the largely "free services" model that most Internet users have come to expect on the Web?
- How could a broad inter-site list of this sort be implemented without creating unacceptable privacy and security challenges carrying the potential for unintended negative consequences?
And so on.
My sense is that the concept of an Internet "do-not-track" list of the type under discussion represents largely the same sort of mostly (though not entirely) political posturing that was behind the telephone-based "do not call" concept, and that the practical issues and problems with such a plan for the Internet are vast.
At this juncture it might also be useful to mention again an important paper I first noted some months ago -- Opt-In Dystopias -- which explores in depth how seemingly obvious issues of "opt-in" vs. "opt-out" in reality can be far more complex and subtle than they might appear to be initially. This paper should be required reading for anyone interested in or involved with these issues.
More to come.
"ICANN said the DNSSEC would eventually allow Internet users to know 'with certainty' that they have been directed to the Web site they sought. 'This upgrade will help disrupt the plans of criminals around the world who hope to exploit this crucial part of the Internet infrastructure to steal from unsuspecting people,' ICANN President and CEO Rod Beckstrom said in a statement."
Greetings. While the implementation of DNSSEC is certainly important, and the avoidance of DNS cache poisoning attacks is clearly very useful, ICANN's "Dragnet-esque" pronouncements about fighting crime strike me as highly ironic.
The simple fact is that "Internet criminals" have a vast array of tools in their arsenal to misdirect users, and few of these depend on cache poisoning or DNS manipulation.
Much of the crime is enabled by the fundamental design of the domain name registry/registrar ecosystem, which enables crooks to easily create and abandon completely valid "disposable" domains that are only used for short periods of time and cannot be reasonably tracked to their owners.
In fact, through its plans to unleash vast numbers of new Top Level Domains (TLDs) on the Internet -- perhaps hundreds in the first year -- ICANN will only be increasing the confusion of consumers and providing fresh juice for criminal operations. Most Internet users aren't calling for new TLDs -- they mainly think in terms of dot-com, and that's unlikely to change any time soon. The main push for new TLDs is from would-be registry operators and their registrar cohorts.
So as far as I'm concerned, ICANN isn't winning the "Joe Friday" crime-fighter award any time soon.
Greetings. Without any conscious effort to change my patterns of applications usage, I've noticed of late that I'm using Google Buzz much more and Twitter considerably less -- this despite the fact that I'm (for now, anyway) following fewer people (and have far fewer followers) through Buzz as opposed to Twitter.
A bit of reflection reveals why Buzz seems increasingly useful, despite the perceived smaller user base.
In a word: Quality. This applies across a number of vectors.
First, and perhaps most obvious -- the 140 character Twitter message limit, supposedly related to SMS text message considerations in Twitter's original design, represents an increasing frustration. Perhaps there is good karma in learning how to become a better headline writer -- a skill that Twitter certainly tends to foster. However, getting beyond the fun of shouting headlines, and instead having some sort of intelligent discussion is extremely difficult within Twitter constraints.
By not imposing message length limits, Buzz avoids this class of problems. And its threaded structure, ability to directly display linked materials, granular privacy controls, and other features contribute to a far more "intelligent" user experience overall, capable of supporting genuine discussions in depth.
Buzz has been viewed by the public largely as a direct competitor to Twitter, but in reality they are significantly different types of applications. I would place Google Buzz somewhere between Twitter and a full-blown discussion forum message system -- but without the user interface baggage that often unnecessarily accompanies the latter -- and with a much cleaner e-mail notification system than either Twitter or most forums can currently offer.
The Google Buzz launch was famously embroiled in controversy over its initial default privacy settings related to contacts discovery (apparently the result of insufficient external testing pre-launch, in contrast to Google's usually robust external testing regimes). The related Buzz defaults were indeed somewhat problematic.
However, that being said, the potential problems associated with those defaults were blown way out of proportion by some observers and the media at large, and to its credit Google moved within hours to alter the default behaviors in ways that completely mitigated any realistic concerns, however minor.
Unfortunately, many persons may have been scared off by the exaggerated reports of Buzz problems, and haven't yet come back to take a second look.
But it would be well worth their time to do so, especially in light of the continuing series of incremental fine tuning, new features, and other aspects of the evolving Buzz service that really do provide Buzz with far more usefulness overall than Twitter for day-to-day use.
It might be argued that the learning curve for Google Buzz is a bit steeper than for Twitter, but this is to be expected given Buzz's power and flexibility compared with Twitter. The intrinsic relationship between Buzz and Google Profiles -- judging from some e-mail queries that I receive -- may confuse some users initially, but this really should not be a significant ongoing problem for most persons, since Profiles are easy to create and can contain essentially as much -- or as little -- information as you wish.
No morals or dramatic wrap-ups today. Just a suggestion. If you've never used Google Buzz, give it a try. If you tried Buzz early on and stopped due to concerns about the launch or other issues, consider trying it again now.
My profile and current Buzz activity are open for public access.
Hope to see you there.
Greetings. The New York Times has published an important article on the subject of the Net's long memory, and the impacts on reputations and other aspects of people's lives when previously posted materials exist essentially forever online.
Regular readers know (all too well!) my frequent comment that we must always assume that anything publicly posted on the Net may be permanent, despite attempts to expunge or ignore any particular data later on.
The Times article makes a number of good points, but in my opinion it also downplays some key realities and, in some cases, confuses important aspects of the issues.
The concept of passing laws to prevent employers from using publicly posted information (e.g. Facebook pages) in their employment decisions seems effectively meaningless. One way or another, those searches will be done and those pages seen, and if necessary some other reason will be cited for declining particular applicants.
The idea of self-destructing data implicitly assumes that all public copies of the public data in question are also deleted, and that all involved entities (in various countries) would even respond to data deletion requests or demands. While deleting data from the most widely seen repositories might reduce its impact for a time, the odds are that it could still be found on other servers and would eventually find its way back into primary search engines again at some point as Net site crawling proceeds.
The article seems to confuse the concept of deleting public data with (their example) Google's anonymizing of log data after a specified period of time. The former is explicitly public data, the latter explicitly private without copies in public view. The two cases are entirely dissimilar. That log data can be anonymized says nothing about the ability to effectively delete publicly available data.
It is disturbing that people are paying significant -- sometimes very large -- sums of money to third parties in an attempt to "game" Google search results to push "negative" links toward the back of results listings. I do continue to feel that some mechanism to help in situations of egregiously false information would be useful. Several years ago I discussed this -- in terms of concepts to think about, not a specific proposal -- in Extending Google Blacklists for Dispute Resolutions.
Decades ago, when the Internet (then ARPANET) was new, I started participating in and helping to develop the then novel concept of public discussion mailing lists, many of which were archived in one form or another even way back then, and the discussions from which in many cases remain online today. I still receive new comments responding to what I wrote on those lists so long ago that new generations of Internet users have discovered.
Yet I remember being conscious even at the time of the likely "permanence" of what I was writing. I distinctly recall saying to a colleague who inquired about this "new-fangled" mailing list stuff that it was useful and fun, but that he'd better assume that anything he sent to those lists might be around forever. He sort of chuckled at my suggestion.
And therein may be a key to these dilemmas. On one hand, individuals need to understand -- from a very young age indeed -- that (just like how you don't want to stick your hand into an open flame) what you post publicly (or even just to your "friends") may well be permanent, and that discretion is indeed the better part of valor in some situations on the Net.
This doesn't directly help in those cases where someone else posts damaging false information -- some sort of dispute resolution mechanisms as mentioned above may ultimately have some role to play in that respect.
But fundamentally, nobody puts a gun to your head and forces you to post personal goodies to Facebook or anywhere else. Peer pressure has always existed and has ruined many lives over time, but as adults the ultimate responsibility has to be our own, not just for ourselves but also for our children who are too young to understand the potentially lifetime ramifications of what they do and say online.
Greetings. Internet "cloud"-based services, both for data storage and as computing resources, are expanding rapidly, and have become a flash point of controversy among some persons in the computer science and privacy fraternities.
On various discussion lists and forums, dialogues about the value and risks of "cloud computing" have devolved into name-calling and impassioned arguments about whether the term "cloud computing" itself is somehow misleading -- with suggestions that data storage services (where encryption is more easily applied by users) should be considered separately from remote computing services -- sometimes called "SaaS" (Software as a Service).
I'm more interested in issues than word wars, so for now (despite the related complaints that I'll receive) I will continue to refer to this entire area as "cloud computing" -- "the cloud" for short.
Some other time we can have a technical discussion of cloud computing's benefits and risks. But there are a couple of truths about the cloud that are in my opinion undeniable, and are too often lost amidst the forest of technical details.
Realize this: The future of computing and communications will increasingly be Internet cloud-based. There is no escaping this truth. The complexity of the services that will be demanded by persons around the world will increasingly be impractical to provide wholly through traditional locally-based resources.
Despite ever more encompassing attempts at automatic software updating regimes, many or most users' computers are in states of relatively poor (or even awful) security, and sport feeble or non-existent data backups, putting immense amounts of personal and business data at risk on users' local disks at any given time.
And to expect non-technical users to somehow manage these ever more complicated computing devices, even with the help of increasingly complex updating environments, is becoming about as nonsensical as requiring that everyone be their own auto mechanic.
That there are privacy and security challenges in the cloud is undeniable -- but research in these areas is proceeding rapidly and holds great promise. Laws that in some cases treat cloud-based user data as having fewer legal privacy protections than locally-based data are no longer tolerable and need to be harmonized so that user data gets the highest practicable level of legal privacy safeguards regardless of where that data resides at any given time.
But for some who dislike the cloud, no amount of technical and legal assurances will ever suffice, simply because they have a fundamental distrust of remote services -- "We never really know what's going on in the cloud!" they say.
And yet, do we really know everything going on in our local computers, even those of us who have spent our professional lives building these technologies?
In most cases, the answer is no. Unless we've written every line of code ourselves, or have compiled every program personally from source code that we've inspected (and presumably understood!) line by line, there is a leap of faith involved in everything we do on these machines.
For that matter, if you're of a conspiratorial bent, do you really know for sure what's going on in those CPU cores that run your computer? Have you inspected every line of microcode? Are you positive that something nefarious isn't going on deep within those busy chips??
More realistically, Ken Thompson -- co-creator of the UNIX Operating System itself -- noted in his 1984 paper Reflections on Trusting Trust, that you can't necessarily even depend on the compilers that you use being free of self-compiling malware and other subterfuge.
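Thompson's attack can be caricatured in a few lines of purely illustrative code. The toy below obviously bears no resemblance to a real compiler -- the "compiler" is just a string-to-string function, and the marker strings are invented -- but it captures the essential point: a trojaned compiler can plant a backdoor in a clean program, and can re-infect any compiler it compiles, so inspecting the compiler's source tells you nothing.

```python
# A toy "compiler" is just a function from source text to "binary" text.
# An honest compiler passes the program through unchanged.
def honest_compile(source):
    return source

BACKDOOR = "ALLOW_SECRET_USER\n"   # invented marker standing in for a real backdoor

def trojan_compile(source):
    """Thompson's trick in miniature: recognize two special targets."""
    if "def check_password" in source:
        # Compiling the "login" program: silently plant the backdoor,
        # even though login's source code is perfectly clean.
        return BACKDOOR + source
    if "def compile" in source:
        # Compiling a compiler: propagate the infection, so that even a
        # clean compiler source yields a trojaned compiler "binary".
        return "TROJANED\n" + source
    return source  # everything else compiles honestly -- nothing to see here

login_source = "def check_password(user, pw): ...\n"
clean_compiler_source = "def compile(source): return source\n"

# The backdoor appears despite login's source being clean:
print(BACKDOOR in trojan_compile(login_source))

# And recompiling a *clean* compiler with the trojaned one still
# produces an infected result -- source inspection cannot save you:
print(trojan_compile(clean_compiler_source).startswith("TROJANED"))
```

The real attack hides the trojan logic inside the compiler binary rather than a visible string, which is exactly why Thompson concluded that no amount of source-level scrutiny is sufficient by itself.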
What this all boils down to in the end is -- to paraphrase Bob Dylan -- You Gotta Trust Somebody.
And in our modern world, you have to trust lots of somebodies at various levels or our entire technological civilization would simply grind to a halt.
We certainly depend on trust in our personal lives. Even though that trust may turn out to be misplaced in particular instances, this doesn't change the fact that trust is fundamental to getting virtually anything done in our modern world.
And trust isn't only a concept for individuals. Just as we trust our friends and lovers -- whose inner thoughts we can never truly know for sure -- we need to make decisions about trust related to technology as well.
The fact that we can't know everything about every aspect of cloud computing services is ultimately just another nuance of the same sort of necessarily incomplete information with which we make every other trust decision in our lives.
Ultimately, if you trust that a provider of cloud computing services is of good ethical standing, will defend your privacy rights against unreasonable intrusions, and provides services with a degree of security and reliability that you consider to be acceptable -- especially in contrast to what you can and do provide locally on your own machines -- then an inability to personally inspect every aspect of operations in the cloud should not be an automatic deterrent to its use.
Technical and standards advances are making the cloud even more attractive. For example, Open Source cloud standards and efforts such as Google's Data Liberation Front provide increasing levels of transparency and data portability.
There are many factors to take into account when choosing cloud services -- just as there are in the process of making bosom buddies. There are no absolute guarantees -- there are always risks in life, both today and tomorrow. But the various aspects of trust are key in both cases, and trust is possible without total knowledge of and control over the other parties involved.
Like love, trust makes the world go 'round.
Greetings. As I've mentioned previously, I tend to receive several hundred e-mails daily that relate to Google one way or another, many of which contain requests for advice regarding perceived or real Google-related issues. I try to help when I can.
And my concerns about what I consider to be significant shortcomings in Google's user communications structure -- especially when dealing with relatively unusual or serious problems -- are fairly well known.
But recent calls for regulatory oversight of Google Search are way off-base, and -- beyond the obvious First Amendment concerns -- threaten to undermine Google's efforts to provide the best possible natural (organic, algorithmic) search results via Google's continuing work to avoid distortions in or gaming of those results.
The fact that Google permits highly controversial search results to maintain algorithmically determined high rankings, even when it would be much easier from a public relations standpoint for Google to suppress those results, is another indication of Google's laudable efforts not to disrupt natural results rankings with manual alterations.
It's notable that in all of the countless cases where people have come to me (sometimes utterly convinced that Google has a vendetta against them) with complaints about their Web site "vanishing" from Google listings or not achieving the kind of result rankings that they felt were deserved, I've never once seen a case where any unfair or unreasonable actions by Google were actually in play in the situation.
In fact, in virtually all of these cases the problems have boiled down to one of two issues.
The first is that the sites have become contaminated with malware, often without the site owners' knowledge. This can result in Google quite reasonably flagging the sites as potentially dangerous to users, with resulting undesirable (but completely appropriate under the circumstances) effects. Even when site owners protest that their sites are clean, on closer inspection it turns out (in my experience) that their sites have been compromised in some manner.
An even more common case is that sites have not been organized in ways that make it possible for Google to effectively crawl their contents, or the sites include elements (often at the urging of unscrupulous SEO -- "Search Engine Optimization" -- firms) that violate the site guidelines Google has established to help avoid gaming of results to the detriment of Google's users overall.
There isn't anything intrinsically evil about SEO per se. In fact -- and here come those "secrets" in plain view that I promised -- Google has put major efforts into making available absolutely comprehensive resources and tools for webmasters, yet it appears that many or most Web site owners don't even realize that these exist.
Google's Webmaster Central is a universe of information, tools, video tutorials, and all manner of other resources that webmasters can use to better understand how Google crawls their sites; spot potential problems with their sites; inform Google how sites are organized to enable efficient and complete crawling of text, video, and other formats; gather metrics on how people discover their sites; and so much more.
Over on YouTube, the Google Webmaster Central Channel contains hundreds of videos on related topics that should be of interest to anyone running a Web site, including many Q&A videos from Matt Cutts, who heads up Google's "Webspam" team -- which is directly involved in these sorts of search quality issues (he's also a good guy -- I recommend paying attention to his suggestions!).
All of this isn't to assert that every problem anyone may have with Google will be solved via these Google resources -- nor to say that effective means of solving every other possible sort of Google-related problem necessarily even exist today.
But for many common situations -- the kind where people may feel that Google is unfairly ranking their site -- or similar scenarios -- I believe that a reasoned analysis of the circumstances, especially in conjunction with the Google Webmaster Resources discussed above, will demonstrate that Google bends over backwards not only to keep their natural search results rankings as useful and honest as possible, but also that Google has worked very hard to explain how to optimize sites for best results -- and has provided tools to help make this as straightforward and painless as possible for webmasters.
Greetings. A few weeks ago in Why Web Video Captioning Is So Important, I mused on the importance of captioning to Web videos, and emphasized why YouTube users should take advantage of various YouTube captioning tools (automated and manual) to create the best possible experience for their viewers.
What I didn't discuss then was a more basic issue -- how the ways that videos are captioned (or dubbed) can fundamentally alter how they are interpreted, even to the extent of completely changing intended meanings.
This was brought home to me very recently when I watched a broadcast copy of a film I had not viewed for many years, the delightful 1966 movie King of Hearts (Le Roi de Cœur). [Trailer]
Set near the end of World War I, the characters in the three involved armies each speak in their native languages. This creates a complicated dubbing/captioning scenario, since typically any audience would want at least two of the three languages translated.
Language translations are less than an exact science of course, especially when idiomatic phrases are involved. (That said, automated translation techniques, such as Google Translate, have become extremely useful indeed, and will only get better with time.)
When I first saw King of Hearts decades ago it was in a fully-dubbed form without subtitles, and used fake accents to try to indicate which language the characters were speaking at any given moment. This led to specific plot elements that never quite made sense to me at the time.
Captioned versions of videos and films have -- in my opinion -- generally done a better job, though timing requirements in manually-captioned cases can sometimes result in "text simplifications" that might leave out words or entire phrases.
In the YouTube context, captions also open the ability to perform automated translations based on the captions themselves -- obviously of immense benefit.
But here's an interesting question -- what happens when captions (or dubbing) are used to fundamentally alter dialogue in a film, perhaps as a form of subtle censorship or worse?
Both dubbing and captions carry this risk, but the risk with dubbing seems far higher, since the underlying original dialogue tracks will not be heard for comparison by native speakers of the language.
Automated captions will virtually always be trustworthy in this sense. While there may be a significant error rate in automatic captions (especially in the presence of background noise or music), deliberate alteration of meaning is highly unlikely, and the original audio is still immediately available for comparison.
I haven't revealed "what's under their kilts" yet. The reference -- and the relationship to this entire discussion -- comes from a short segment of dialogue in King of Hearts itself that was the trigger for my pondering this topic today.
In a particular captioned scene (the actors are speaking French at this point), a number of Scottish soldiers are dancing a jig. A character in the film asks her companion, "What's under their kilts?" To which the companion -- after taking a quick peek -- replies, "Nothing!" Leading to the response from another character, "You mean everything!" A rather cute turn of phrase.
But when I saw this scene a couple of evenings ago, I couldn't remember ever having heard the "Everything!" response before. In fact, I recalled -- from many years ago -- an entirely different and rather odd line of dialogue.
I dug out an old tape and discovered that I was correct. The ancient dubbed King of Hearts version in my collection had the character replying to the question, "What's under their kilts?" with the response I remembered: "Petticoats!" (making the following line, "You mean everything!" completely nonsensical).
Since this was a dub job, I couldn't hear the original French dialogue, and I can't effectively lip-read French (or any other language, for that matter).
I'll admit that a rather ham-fisted attempt at film sanitizing may not be a big deal in the scheme of things -- but it had me fooled for many years.
Still, these issues -- particularly the key potential to verify captions by inspection of the original audio -- may be particularly important (for example) in the context of sound bites with political ramifications, where unscrupulous parties might try to post materials with falsified translations aimed at particular target audiences.
While one might hope that Internet access to underlying source materials and references would tend to reduce such risks, the plain truth is that many persons will simply accept what they see or hear the first time around and never think to go digging on the Net for verification.
Just a little something to consider, especially if you're ever inclined to doubt the ever-growing importance of captions in our increasingly video-centric world.
Greetings. One of the techniques that ICM Registry has been using to try to demonstrate public demand for a dot-ex-ex-ex top level domain (TLD) has been touting various "poll" results. Right now they're pushing a new CNN "poll" that seems to show an amazing 83% approval rate.
But wait a second. What are we actually talking about here? Turns out that the CNN "poll" wasn't a scientific poll at all -- merely a scientifically worthless "self-selected" online poll. Statistical value and meaning: virtually nil.
ICM also promotes other magazine and newspaper polls over the years that gave similar lopsided numbers. It's unclear from their statements whether any or all of those were also self-selected polls, but it appears quite possible. Way back in 2004, ICM hired Lombardo Consulting to poll 1000 people on the topic (1K is indeed a typical national scientific poll size), and got similar results.
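The distortion introduced by self-selection is easy to demonstrate numerically. Here's a minimal sketch in which the population percentages and response propensities are entirely invented for illustration: a population split 50/50 on a question, where supporters are simply more motivated to click the online poll than opponents are.

```python
import random

random.seed(42)

# Hypothetical population: exactly 50% approve, 50% do not.
population = [True] * 5000 + [False] * 5000

def self_selected_sample(pop, p_if_approve=0.30, p_if_disapprove=0.05):
    """Self-selected online 'poll': whether someone bothers to vote
    depends on their opinion (propensities invented for illustration)."""
    return [v for v in pop
            if random.random() < (p_if_approve if v else p_if_disapprove)]

def random_sample(pop, n=1000):
    """Scientific poll: respondents drawn at random, opinion-blind."""
    return random.sample(pop, n)

def approval(sample):
    return 100.0 * sum(sample) / len(sample)

print(f"True approval:         50.0%")
print(f"Random sample of 1000: {approval(random_sample(population)):.1f}%")
print(f"Self-selected 'poll':  {approval(self_selected_sample(population)):.1f}%")
```

With these (made-up) propensities the self-selected "poll" reports approval in the 80s while a random sample of the very same population hovers near the true 50% -- which is why a lopsided click-poll result, however dramatic, tells you nothing about actual public opinion.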
But even aside from issues of self-selection, the key to polling is of course the nature of the questions. Remember How to Lie with Statistics? Still a classic ... [As always, I've substituted "dot-ex-ex-ex" below to avoid e-mail blocking problems]:
CNN: Do you think pornographic websites should have their own "dot-ex-ex-ex" domain?
Business Week: Should purveyors of porn get their own domain?
Huntington Herald Dispatch: Would creating dot-ex-ex-ex keep Internet users from accidentally stumbling upon porn sites?
And finally, Lombardo: If those who run the Internet could assist in preventing child pornography and make the Internet safer for children and families by creating a dot-ex-ex-ex Internet address, would you support this?
It's difficult to imagine a more intellectually dishonest set of questions. Leaving aside loaded words like "purveyors" -- the questions appear to obviously suggest that all targeted sites (Only professional? Also amateur? Just U.S. or worldwide?) would be somehow limited exclusively to the dot-ex-ex-ex domain.
And it appears (from what I can determine so far at least) that no significant mention was made of the fact that the proposal includes no mechanism to force such sites to only reside in dot-ex-ex-ex (via oppressive "domain ghettoization" legislation or the like) -- which would certainly be quite appropriately subject to immense litigation battles. I wonder how these poll participants (self-selected or not) would have responded if it was made very clear that dot-ex-ex-ex was in addition to existing (e.g. dot-com) domain names?
The Lombardo question seems the most egregious, making completely unsupported claims about making the Internet safer and assisting in the prevention of c-porn. The latter is particularly ludicrous because c-porn is already illegal and no legitimate sources for such materials exist on any site or in any TLD.
In other words, the polling data being promoted by ICM Registry is misleading and biased, therefore statistically worthless -- and the quintessential essence of unmitigated bull.
Greetings. With much fanfare in mid-June, Minneapolis announced the activation of a free public Wi-Fi network, with 117 outdoor hotspots, "for use by residents and visitors alike."
Just one problem. At the apparently explicit "request" of law enforcement, you can't access the current system without first creating an account using a credit card!
We're told that, "[The] log-in process was requested by law enforcement officials because being able to log on to the Web anonymously presents security concerns."
One long-time expert on municipal wireless noted to me that the CEO of U.S. Internet (the firm operating Minneapolis' Wi-Fi system) claimed that federal law requires such a procedure. Say what? I've heard of no such law. Various public Wi-Fi systems require no log-in at all, and use of credit cards is normally restricted to systems that actually charge for access.
However, in Minneapolis, it appears to be "no credit card, no Wi-Fi" -- but if someone establishes an account in your name using a stolen credit card and then proceeds to do something nasty, the hassle (or worse) is yours.
What's apparently going on in Minneapolis is a combination of the desire to enable the tracking of as much Internet activity as possible, and the time-honored tactic of CYA.
As I've noted in Why the New Federal "Trusted Internet Identity" Proposal is Such a Very Bad Idea and related essays, open access to the Internet is now under fire from a variety of government entities who want to be able to find out as much as possible about everything you do on the Internet, all of the time.
Efforts to predicate public Internet access on verifiable and easily trackable identification as a matter of course should be strenuously resisted by all Internet users who care about their ability to routinely communicate as they choose, without the threat of real-time or retrospective surveillance of their activities in an ever-expanding circle of dubiously justified circumstances.