Greetings. Fairly recently, I was interviewed for an article to appear in Redmond magazine (The Independent Voice of the Microsoft IT Community). The topic of the article was presented to me essentially as "Who has bigger and scarier privacy problems, Microsoft or Google?"
The resulting lengthy article, What Does Microsoft Know About You? has just appeared, and despite what I would characterize as an obvious and significant anti-Google bias in the overall piece, the article is still an interesting and worthwhile read, even if only to illustrate how these sometimes emotional issues are being played out in such publications and among such affinity groups.
But that's just my opinion. What's your take?
Update (7:30 PM): OK gang, yeah, I did notice that the article (at least as currently displayed in the online version) includes a rather amusing typo given the topic of the piece. Out of deference to the author, I hadn't planned to mention the error, but with so many people sending me notes about it, that doesn't seem as practical now. Let's just assume that it was a genuine typo, not a Freudian slip!
Greetings. An article on Slashdot today seems to blame Google and Android for the ease with which two Caller ID spoofing programs can manipulate Caller ID and gain illicit access to AT&T (and other) voicemail systems. It even attempts to draw in the (to my mind irrational) complaining about Google's accidental Wi-Fi payload data collection.
I've talked about CNID (Calling Number ID) spoofing various times before, but let's be really clear about this.
CNID spoofing is not the fault of Android or Google, any more than it's the fault of Time Warner or Comcast when users access Web-based CNID spoofing services. The fundamental problem is that the CNID system was never designed for an environment where, to use the vernacular, every Tom, Dick, and Harry has access to the underlying subsystems, a problem that has become much more serious with the rise of VoIP/SIP-based access mechanisms.
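To make the underlying weakness concrete, here is a minimal sketch in Python -- purely illustrative, not a working SIP client, with hypothetical addresses and numbers -- showing how a SIP INVITE carries its calling-number information in an ordinary, caller-supplied text header:

```python
# Illustrative sketch only (NOT a functional SIP implementation).
# The point: in SIP, the "From" header -- the usual source of the
# Caller ID data presented downstream -- is just a text field filled
# in by the caller's own equipment or software. Unless a carrier or
# gateway chooses to validate it, it is simply passed along.

def build_sip_invite(claimed_number: str, target: str) -> str:
    """Assemble a minimal SIP INVITE; all header values here are
    hypothetical examples for demonstration purposes."""
    headers = [
        f"INVITE sip:{target} SIP/2.0",
        # Nothing in the protocol verifies that this number actually
        # belongs to the caller.
        f'From: "Any Name" <sip:{claimed_number}@example.com>;tag=1234',
        f"To: <sip:{target}>",
        "Call-ID: abc123@example.com",
        "CSeq: 1 INVITE",
    ]
    return "\r\n".join(headers) + "\r\n\r\n"

msg = build_sip_invite("3105551212", "voicemail@example.net")
```

Any number at all can be placed in that `From` field, which is why systems (like some voicemail platforms) that treat the reported calling number as an authenticator are fundamentally insecure.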
A rather comprehensive history of CNID spoofing [calleridspoofing.info] and related areas makes for useful reading. (This falls into the "it takes one to know one" category of Web sites, apparently.)
Google Voice, as an example of the correct approach, makes users explicitly aware of spoofing risks and requires additional confirmation steps if attempts are made to set up accounts without passcodes.
There are legitimate situations where manipulation of CNID data is completely reasonable. Services (like Google Voice, for example) may want to pass through calling number data so that called parties have accurate information regarding the origin numbers of callers. Businesses may want to send their main number as the CNID reference, not extension numbers, which may not even take incoming calls.
There are concerns that currently pending U.S. legislation to outlaw nefarious CNID manipulation might adversely affect legitimate uses. My belief is that it should be possible to craft wording in the final legislation that would protect such honest applications -- this is indeed important.
I do feel, though, that it is also important for U.S. federal law to be on record that using Caller ID spoofing to intentionally falsify a caller's identity is generally unacceptable and would normally be subject to appropriate legal sanctions.
Greetings. Last Friday, in White House Proposes Vast Federal Internet Identity Scheme, I posted a brief thumbnail expressing my major concerns regarding the expansive federal Internet Trusted Identity proposal.
Here are a few details explaining why I'm taking such a negative view of this plan.
It's important to note that this entire proposal under discussion, at this stage, is of course nothing but smoke. It has no functional reality, other than as a (useful) starting point for further discussion. But when viewed in the context of other government-related efforts, trends, and statements, it is quite alarming nonetheless, and it's very difficult to overstate its potential for serious negative consequences. Though indeed, like the vision of Christmas Future provided to Ebenezer Scrooge, it's currently only a shadow of what might be, not of what must or necessarily will be.
Let's look at one of the "Envision It!" boxes in the plan as posted at the Department of Homeland Security:
An individual voluntarily requests a smart identity card from her home state. The individual chooses to use the card to authenticate herself for a variety of online services, including:
Credit card purchases,
This is, by definition, a government-issued identity card. The plan appears to envision a user authenticating themselves even for pseudonym-based or "anonymous" activities. We can call such a posting "anonymous" if we wish -- but if the user has already authenticated, we're then dependent on the "proper" behavior of all players to actually treat the subsequent transactions in a truly anonymous manner.
And anonymous to what extent? Perhaps a blog comment would appear on the Web anonymously, but when the lawyers show up demanding to know who posted that critical comment -- something that's happening with increasing frequency even now -- I'll bet you dollars to donuts that the initial authentication records will be available through some means to unmask the poster, or to correlate pseudo-identities that users may prefer to use for different purposes and "roles" on the Net.
The goals behind such an all-encompassing identity regime seem clear. While it could indeed provide some improvements over existing authentication methods in financial transactions and the like, the cost to civil liberties could be very high indeed, because -- as I read the plan -- the end result would be a detailed record -- likely captured by upcoming government proposals for expansive Internet service data retention requirements -- that could be used to "unwind" (unmask) anonymity on demand.
As I noted in Saving Internet Anonymity -- The Struggle is Joined, the increasingly shrill calls to put every possible Internet transaction into government-accessible databases have become an ever louder drumbeat.
And I believe we can easily dismiss the term "voluntary" used in the proposal -- since there's every reason to believe that such authentication regimes would quickly become effectively mandatory -- due to various pressures and liability concerns that don't take a lot of imagination to understand. Identity "mission creep" is virtually a certainty, though the conflicts that this is likely to create in an international environment like the Internet are certainly interesting to contemplate.
History, both long past and recent, shows us very clearly that -- human nature being what it is -- governments on the whole can't be trusted not to abuse data about their citizens' activities. Such abuse almost always evolves from what initially appear to be laudable motives of law enforcement and public welfare, but can rapidly degenerate into totalitarian nightmares.
Even if you (appropriately) view our current and recent federal governments as essentially relatively benign, we've still seen many instances of unjustifiable and even illegal surveillance and Internet data abuse -- even in the absence of long-term data retention requirements of the sort now being contemplated.
And even with the best of intentions, firms who are the custodians of user data and identity info are at the mercy of the civil legal system, above-board government demands for data, and -- as we've seen -- "secret" government data demands as well.
What of future governments, who might not be as benign, but would have at their fingertips the vast Internet identity infrastructure being contemplated -- what will they do with that shiny bauble?
I'm all in favor of discussions about how the Internet industry can improve the security and validity of transactions that need strong authentication -- such as in the financial sector or when dealing with medical health records. But the sort of government-entangled identity structure being proposed by the White House in the current document is -- perhaps even to a very significant degree unintentionally and with genuinely good intentions -- a wolf in sheep's clothing with the potential to decimate civil liberties on and off the Net for generations to come.
Greetings. The White House has just released the draft of a rather chilling document -- tellingly hosted on Department of Homeland Security servers -- that proposes the creation of a vast, federally-led "Trusted Identities in Cyberspace" infrastructure that would potentially reach into nearly every aspect of Internet use, from financial transactions to comments on blogs. The White House is seeking public comments on the proposal.
While touted as a voluntary public/private partnership toward universal Internet identities, it seems clear from an initial reading that such a scheme is a preemptive push toward what would eventually be a mandated Internet "driver's license" mentality of the sort I've been warning against (e.g. Saving Internet Anonymity -- The Struggle is Joined -- April/2010).
It is certainly true that there are specific situations on the Internet in which strong identity credentials are very useful, and various of the problem scenarios outlined in the White House draft are real to one degree or another. But Internet industries have been working effectively to develop systems, such as OpenID, that can address such concerns in a truly voluntary manner without government involvement or interference, and without requiring or coercing individuals into sharing identities across multiple sites against their will.
Let me put it this way in brief for now. Attempts by the federal government -- or other governmental entities for that matter -- to usurp leadership roles in any aspect of Internet identity ecosystems should be politely but strongly rejected.
I will have much more to say about this in the future, but since many people were already asking me about the White House draft, I wanted to get this initial thumbnail analysis out the door as quickly as possible.
Frankly, the concept of the federal government taking its proposed role in this area -- especially in today's political climate -- is so obviously unwise, and perhaps potentially dangerous, that it's not even a close call. This is especially true given the increasing calls from some in government for massive Internet data retention regimes that could easily be linked with such federally-coordinated Internet ID systems.
I am hosting a local PDF copy of the White House draft here: White House Identity Draft
You can also download the document from the Department of Homeland Security.
More to come.
Blog Update (27 June 2010): Why the New Federal "Trusted Internet Identity" Proposal is Such a Very Bad Idea
Greetings. As predicted in ICANN Likely to Approve "Dot-Ex-Ex-Ex" Domain for Chumps!, it is now reported by AP that ICANN is moving ahead towards final approval of the dot-ex-ex-ex TLD (Top Level Domain).
Note in the AP piece how the fellow behind ICM Registry -- the proposed operator of dot-ex-ex-ex -- is already crowing about the bundles of money he believes he'll make, and claiming that he'll require registrants to include meta labeling on their Web pages, presumably so that they can be widely blocked. Wow, such an attractive deal for adult sites! Why not just pour nitric acid into your servers? The ultimate effect would likely be very similar.
I believe that if dot-ex-ex-ex is "finally" approved, we can expect that:
(a) Dot-ex-ex-ex blocking will become essentially a default condition for much of the Internet, enforced by governments, organizations, and many ISPs in various ways.
(b) Various governments will attempt to mandate that "adult entertainment" sites sign up in and/or move totally to dot-ex-ex-ex (likely what ICM is actually hoping for).
(c) A flurry of lawsuits on all sides will be forthcoming.
(d) Dot-ex-ex-ex will embolden legislative efforts to force other categories of inconvenient or "undesirable" Internet speech and materials into their own more easily blocked TLDs and labeling regimes.
The DNS (Domain Name System) has become a means to extort protective name registrations from sites that really have no desire to be involved in new TLDs, a mechanism for further confusing consumers (and, incidentally, enhancing the value of dot-com), and now, an enabling mechanism for Internet censorship and "thought control."
By the way, why do I keep referring to the TLD under discussion as "dot-ex-ex-ex"? Because if I use its real label, e-mail filters will automatically reject my associated mailings at many sites -- which helps to demonstrate how utterly insane this entire situation has become.
Greetings. Word is that tomorrow ICANN is likely to reverse itself yet again, and (under continuing lawsuit threats from the would-be TLD operator who desperately wants to cash in on this fiasco) unwisely approve a "dot-ex-ex-ex" top-level domain.
Every word that I wrote back in 2005 about this topic, in Open Letter: Why "Dot-Ex-Ex-Ex" is for Chumps, still holds true -- perhaps even more so now, half a decade later.
If ICANN moves as reported on this, it is bad news for everyone concerned about free speech and civil liberties on and off the Internet, regardless of how you feel about the sorts of enterprises being targeted by this new TLD.
Blog Update (25 June 2010): ICANN Moves Forward with Dot-Ex-Ex-Ex, while ICM CEO Plans for Big Bucks and Censorship
Greetings. I've said it often -- once data is on the Internet, never assume that it can ever really be completely controlled or removed.
Promoted as a sort of "privacy-enhanced" version of YouTube, the VidMe spiel is that they provide a video hosting service where you can control exactly who has access to your videos at any given time, revoke video playback access whenever you want, prevent downloading and forwarding of videos without your permission, and so on.
VidMe is attempting to tap into concerns regarding videos (potentially embarrassing, or otherwise where public viewing is not desired) that fall into the wrong hands or go unexpectedly and undesirably viral.
YouTube already provides three privacy control tiers (beyond the default of public access) -- private videos, group-shared videos, and the new (and very useful) "unlisted" video feature. VidMe takes this a step further with per-user granularity in access controls, and reportedly implements some additional mechanisms to try to make it harder for persons to access or save copies of videos without the owners' permission.
Since VidMe is basically selling a promise of privacy, one would hope that it could actually provide the advertised abilities for owners to prevent unauthorized viewing or distribution of videos. This is especially important since VidMe apparently plans to eventually charge users to upload videos to the service, beyond a few free videos per account.
But VidMe has some significant problems, not the least of which being that they cannot deliver the level of video privacy and control that they seem to be promising -- not due to any technical limitations in their service per se, but rather because that kind of privacy control is essentially unattainable in the current public Internet environment.
The VidMe flash player seemed very slow to buffer and play in all browsers that I tested. It hung, crashed, and burned whenever I tried to play test VidMe videos under Google Chrome.
OK, that stuff almost certainly can be fixed. But a much bigger problem for the VidMe "control your videos' distribution" business model is that every single technique I tried to locally capture displayed VidMe videos was fully successful without any difficulty whatsoever.
Every video stream grabber utility that I executed was able to capture and locally store both video and audio from VidMe playback streams. There are some video sites that at least make this sort of stream capture more difficult -- VidMe isn't one of them.
And just for chuckles (since the results seemed preordained), I also easily captured VidMe playbacks using the free CamStudio Open Source package, which quickly and neatly enables high frame rate, high-quality screen and audio grabs directly from display buffers -- no need to capture the actual data streams themselves.
In every case, in every test, I ended up with fine looking video copies, complete with audio tracks, that I could -- if I had wished -- post anywhere or forward to anyone without restrictions.
My real gripe here is with how VidMe is promoting their service, and the extent to which unsuspecting users who might not understand the technical realities of Internet video could be painfully surprised if they took VidMe's pitch at face value.
Common sense alone should remind us that, if nothing else, anyone could aim an inexpensive digital camcorder at a computer display and capture a low-quality copy for distribution. And if VidMe wishes to assert that most people don't have the stream or display capture software that I used for testing, or wouldn't bother to use it, that's OK -- but at least dial back the promotional language that could easily mislead many persons into believing that VidMe provides a level of video privacy and control that is simply impossible in the existing Internet ecosystem.
VidMe's fine-grained site playback access controls do have value in and of themselves, though I frankly have my doubts that their ultimate pay-to-upload plan is viable from a business standpoint.
But make no mistake about it -- videos played via VidMe, just like from every other video site on the Net, can be captured and redistributed without permission -- one way or another.
Love it or hate it -- that's just the way it is.
Greetings. The New York Times has published a good article about Web video captioning, which provides me with a convenient hook to briefly discuss this very important topic.
I can't emphasize enough how vital captioning is for the entire Web. It provides the critical link between text and the audio layer of video presentations -- not just for the crucial purpose of serving the hearing-impaired community, but also to enable video content search for all users (e.g., finding videos in the main Google index based on narration or dialogue), and to enable associated automated language translations for everyone.
Google's work in the area of YouTube videos "auto-captioning" is particularly fascinating, in far more ways than I can discuss here right now.
If you have videos on YouTube, I urge you to explore the various captioning control and enhancement options that are now present in your YouTube account video controls. In particular, the Google-provided auto-captioned transcripts can be used as the basis to manually "clean up" errors in the automated captions for videos, much more rapidly than videos could normally be captioned manually from scratch.
As I understand the current situation, the caption texts from purely auto-captioned videos are not currently included in YouTube/Google search results due to the perceived auto-captioning error rate (though in many, even most cases, that rate seems to be quite low). So it's important at this stage to "clean up" the auto-captions on your videos yourself whenever possible, so that your video captions can be integrated into the search databases.
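For context on what a caption file actually looks like: corrected captions can be downloaded, edited, and re-uploaded as simple timed-text files. SubRip (.srt), one widely supported format that YouTube accepts, pairs each numbered cue with a time range and the corresponding text (the timings and dialogue below are, of course, just hypothetical examples):

```
1
00:00:01,000 --> 00:00:04,200
Welcome, and thanks for watching this demonstration.

2
00:00:04,700 --> 00:00:08,500
Each numbered cue maps a time range to a line of dialogue.
```

Cleaning up an auto-generated transcript is mostly a matter of correcting the text within these cues, since the timing information has already been generated for you.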
Please let me know if you're interested in more information regarding this area.
Greetings. As I noted recently in "Highly Illogical": The Hysteria Over Google's Wi-Fi Scanning, the unseemly and opportunistic attacks, lawsuits, and now perhaps even criminal prosecutions of Google over their accidental recording of unencrypted Wi-Fi payload data seem to call into question the overall rationality of our species.
After all, these were unencrypted transmissions being broadcast on public airwaves, and Google's accidental capture of data snippets hardly compares with the risk those Wi-Fi owners face from bad guys purposely collecting that data for genuinely evil purposes (and even then, we're only talking about data that wasn't protected above the Wi-Fi layer by mechanisms such as SSL/TLS).
On the other hand, it's obviously to be expected that Google's adversaries (including some governments with somewhat irrationally conflicted views over public vs. private data, imagery, etc.) would seize on any slip to try to stake Google out for the wolves.
But ultimately, public is public. Information that is disseminated in unencrypted forms is always going to be vulnerable to purposeful or accidental interception, and the solution to this situation is encryption, not legislation.
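The core point can be illustrated with a deliberately simplified sketch -- a toy XOR stream cipher built from a hash function, which is NOT real cryptography (in practice, use TLS or similar vetted protocols). An eavesdropper on a shared medium sees whatever bytes are transmitted; only encryption, not policy, keeps the payload private:

```python
# Toy illustration only: do NOT use this construction for real data.
# It exists to show that the same eavesdropper who reads a cleartext
# frame directly sees only gibberish when the payload is encrypted.

import hashlib
from itertools import count

def _keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudo-random bytes by hashing key+counter.
    A toy keystream for demonstration purposes only."""
    out = b""
    for i in count():
        if len(out) >= n:
            break
        out += hashlib.sha256(key + i.to_bytes(8, "big")).digest()
    return out[:n]

def xor_encrypt(key: bytes, payload: bytes) -> bytes:
    """XOR the payload with the keystream (applying it twice decrypts)."""
    ks = _keystream(key, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))

secret = b"password=hunter2"
# Sent in the clear: anyone within radio range reads it directly.
plaintext_frame = secret
# Sent encrypted: the same eavesdropper captures only ciphertext.
encrypted_frame = xor_encrypt(b"shared-key", secret)
```

The interceptor's capability is identical in both cases -- they capture the same frames either way. What changes is whether those captured bytes are useful, and that's a property of the sender's own protective measures, not of any law.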
I had an interesting personal incident occur recently that may be at least a bit illuminating in this "what is public?" discussion.
For several years, I've sometimes dictated the initial drafts of particularly long papers or reports into an inexpensive hand-held digital recorder. I blab my thoughts into this thing wherever I am, later dump the audio data files via USB, then run them through speech-to-text software (usually NaturallySpeaking) -- typically with highly satisfactory results.
Brief Aside: Speech recognition systems have long been one of my areas of interest. The availability of speech-to-text systems today always strikes me as a true science fiction concept brought to fruition. (Here's a one-minute video clip I threw together featuring two 20th century science fiction TV show concepts of "futuristic voice dictation" - from 1979's original Battlestar Galactica, and -- a bit more tongue-in-cheek -- from Star Trek in 1968.)
Anyway, one day a few months ago when I examined the automatically transcribed results of the previous week's dictation dump, I was startled to find (sometimes garbled, sometimes intelligible) snippets of conversations in the resulting text that had nothing to do with what I had dictated!
What the ...? I went back and listened to the original audio files, which I normally didn't do (my standard protocol is to simply upload the audio data files and later inspect the resulting text).
The problem's source was immediately apparent. There were other voices in the background of some recordings, picked up in the vicinity of where I had been dictating -- in stores, fast food eateries, and so on. Many of these voices were clear enough that the speech software had tried -- often with considerable success -- to transcribe them along with my intended verbiage. I deleted the original audio files, as is my standard practice, and edited out the erroneously collected text snippets.
The "foreign" remarks were all pretty much meaningless bits and pieces -- a few words here and there -- but why had this suddenly occurred and how had it gone on for days unnoticed? After all, I've been using this recorder for years, often in public places, and it had never picked up anything from other conversations before.
I found the reason. On the back of the recorder is a tiny flush-mounted switch, which I had never knowingly altered, that selects between high and low microphone sensitivity. I had always left it on the "low" setting, which caused the unit to effectively ignore all but my own voice. Somehow that switch had moved to the "high sensitivity" position, causing the unit to pull in surrounding voices as well as my own. There was no obvious indication of this, and I hadn't even noticed the switch, since I carry the recorder in a small case.
You know where I'm going with this. The accidental recording of very short ambient background speech snippets doesn't represent a real risk to anyone, just as Google's accidental recording of unencrypted Wi-Fi payload snippets was an unfortunate oversight, not an evil plot.
We need to understand that unless we take steps to protect what we consider to be "confidential data" in public spaces, that data is vulnerable to being overheard -- not only accidentally, as in both of the cases described above, but also by bad actors with truly nefarious goals -- and it's the latter group that we really need to be concerned about.
This holds true in the world of Wi-Fi, and in the more mundane environs of the local burger joint ordering queue.
Trying to treat public spaces as if they were somehow legislatively "private on demand" is ultimately a fool's game.
Greetings. I've been booked as part of a panel to discuss Internet issues on China Radio International's Today show, airing live this Wednesday, 16 June, from 1000-1100 Beijing Time (that's 1900-2000 Tuesday, 15 June, Pacific Daylight Time).
China Radio International (CRI) is a state-owned media network heard over-the-air in major Chinese cities, through various AM and FM broadcast stations around the world, over the Internet (live and podcast), and via shortwave, satellite, etc. (More info.)
While I don't yet have a complete list of topics for the show, I believe they will be of broad interest relating to the Internet in general, and to issues focused on China's rapidly expanding Internet operations in particular.