Action Items: What Google, Facebook, and Others Should Be Doing RIGHT NOW About Fake News

Today is action items day, and there isn’t a moment to lose before someone gets killed as a result of the fake news scourge. It nearly happened a couple of days ago, when some wacko invaded a pizza restaurant and shot it up looking for the youthful “sex slaves” that the fake “Pizzagate” story claims exist (a total fabrication created out of whole cloth and part of the complex of fake anti-Hillary sex stories even being promoted by highly-placed wackos in Trump’s White House circle). In fact, there are already new fake stories circulating regarding the shooting itself.

There are some ongoing efforts to begin dealing with fake and false news at the big firms. Facebook appears to be running an experiment asking some users to rate how “misleading” some link titles might be. This will no doubt collect some interesting data and may be a small portion of solutions, but of course cannot alone solve the underlying problems.

Having spent enough time inside Google to have some sense of how the world looks at Google Scale (i.e. “Big” with a Capital “B”), I am convinced that efforts to deal with the Fake/False News problem must primarily be based on algorithmic, automated systems. Humans will also still have important roles to play in this process in terms of tagging, flagging, and verification at least — especially for items that are suspected or verified fakes but are still trending upward very rapidly.

So, Action Item #1: We should be looking at automated systems for doing the bulk of the first level work to detect fakes, or else we’ll be swamped from the word go.

And I believe that the foundational resources to get this done do exist. Google and Facebook (just to name two obvious examples) have powerful AI architectures that could be leveraged toward such tasks, given the will to do so.

Action Item #2: We must understand the true dynamics of how fake and false news are shared — how they rapidly reach large numbers of users and push high into search results. It’s popular to simply assert that everyone believing/sharing these fake stories is just evil or stupid (or both).

That’s way too simplistic an assertion. Even over the very short time that my fake news data collection effort has been active, obvious patterns in the data are already emerging.

One pattern that hits you in the face immediately is that the vast majority of users who share fake news are not stupid and not evil, but they are very much confused by the misinformation surrounding them. There’s a sense that “Well, if it looks professional, or if this ranks highly in search, or if Facebook showed it to me, or my friends shared it with me, it at least might be true, there might be something to it somehow, so I’ll share it too!”

This appears to be a far, far larger group of users than the ones who are actually generating and voluntarily wallowing in this trash. In fact, the latter group is voluntarily in their own “echo chambers” — and like with most any group of dedicated haters, Internet-based efforts to change their minds will likely be wasted.

But for a much larger segment of users who are misinformed, confused, and don’t even realize that they have become involuntarily trapped in echo chambers by fake and false news, there is definitely still hope.

This emphasizes a key point that various observers including myself have previously noted. Older users and other users with less Internet experience tend to believe items that look professional, that appear to be from sources that are visually attractive and seemingly structured in a more “news traditional” manner. On the other hand, younger users or other users with more Internet experience tend to care much less — or not at all — about the “professionalism” of the source and give much more credence to items that rank highly in search, are surfaced by services like Facebook, or are widely shared by their friends.

And this gets us to the crux of the matter. By and large, the Internet economy has evolved into a click-based popularity contest. Both in terms of search and social media, it is basically designed to surface content based on how many people appear to have interest in that content. That’s somewhat a simplification of course but it’s fairly close to the mark. And let’s face it, given two stories presented as accurate — one that discusses how people eat pizza, the other an actually fake story describing a nonexistent child sex ring — which is likely to get the most clicks — and so the most revenue?

While a variety of the big fake news sites are related to persons with political motives, a large number are operated by individuals who have no political motives at all — they are “merely” enriching themselves by creating false stories that they believe will get the most shares and “engagement” clicks for their own monetary enrichment.

On the other hand, I’ll tell you as one of the individuals involved in Internet development for decades that we did not build and grow the Net to be a tool for paying people to post fake news, nor to use such false content to help elect a lying sociopath as President of the United States.

Yet the click-based Internet economy is what it is, and alternative models such as subscriptions have seen only limited success. Other concepts such as micropayments even less so.

So what are we to do? This brings us to …

Action Item #3: I continue to strongly feel that censorship is not the best answer to this set of problems, and that more information — not less — is the path toward solutions. Downranking — where fake stories would still exist but no longer be so prominently featured in search results or system shares — can be a viable approach if handled with caution. In particular, only the most serious and dangerous fake content would typically be considered for manual downranking. For most fake news situations, organic (natural) downranking is a much more desirable procedure.

And that’s where labeling comes in. If fake news that has managed to reach high search results and massive sharing were labeled as fake or in some other relevant distinctive manner, I believe that this would give some pause to that large group of confused users, result in less sharing of fakes, and ultimately in the organic downranking of many such stories.

What’s more, in comments I’ve received it’s clear that many users are desperate for help in evaluating the truth of the content that comes pouring in at them now. How can we really blame them for accepting false stories as real when we don’t even make the effort to point out and label the fakes that we definitely know about?

Obviously it’s the case that detecting, evaluating, and labeling content on an Internet scale — even if we restrict our efforts to highly trending and highly ranked items — is a very significant undertaking, even with the best of AI resources doing the bulk of the work. Such issues as the exact wording of labels can also be complex. Do we actually want to label a known false story as “false” per se? Snopes does this successfully at their relatively limited scale, but they don’t have particularly deep pockets, either (ironically but predictably, all manner of fake news stories are written and widely promulgated against Snopes). Another approach as an alternative to a specific “false” label would be the assigning of a kind of “confidence rank” to such stories — with the known fakes perhaps getting a rank of zero.
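To make the “confidence rank” idea a little more concrete, here is a minimal Python sketch. It is purely illustrative and not a description of how any search or social media firm works: the `Story` type, the `automated_score` field (standing in for whatever estimate an automated system might produce), and the label wording are all assumptions introduced for this example. The only rule taken from the discussion above is that a verified fake gets a rank of zero.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Story:
    """A trending or highly ranked item (hypothetical structure)."""
    url: str
    known_fake: bool = False                 # set after human verification
    automated_score: Optional[float] = None  # assumed estimate in [0, 1]


def confidence_rank(story: Story) -> Optional[float]:
    """Return a rank in [0, 1], or None if the story has not been rated."""
    if story.known_fake:
        # Known fakes get a rank of zero regardless of any automated score.
        return 0.0
    if story.automated_score is None:
        return None
    # Clamp so that a misbehaving scorer cannot produce an out-of-range rank.
    return min(1.0, max(0.0, story.automated_score))


def display_label(story: Story) -> str:
    """Text that could be shown next to a search result or share."""
    rank = confidence_rank(story)
    if rank is None:
        return ""  # unrated items are left unlabeled
    if story.known_fake:
        return "Confidence rank: 0.0 (known fake)"
    return f"Confidence rank: {rank:.1f}"
```

Leaving unrated items unlabeled is a deliberate choice in this sketch: the absence of a label should not be read as an endorsement, and the hard part (producing a trustworthy score at all) is exactly what the sketch leaves out.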

As always, the devil is in the details, but I’m convinced that some combination of these or related concepts can be made to work, especially given that the status quo is no longer tenable.

Action Item #4: Parody as a test case. The ability of many (most?) people to recognize parody or satire on the Net (unless it is clearly labeled) can be very poor. I ran into this myself when I wrote April Fools’ columns for the CACM journal — even with that highly technical audience some readers assumed that what I thought was obvious and outrageous satire was actually real. The same thing happened with a satire video I released on YouTube years ago as well.

A significant number of the “fake news” stories are sourced from satire sites (that is, at least ostensibly satirical sites — many seem to call themselves satire in small print to try to cover fake items with clearly political motives, or mix fake and real items on their sites to cause even more confusion). Yet even items from known satire sources like “The Onion” — and “Borowitz” from “The New Yorker” — frequently explode into mass visibility without any indication that they aren’t “legit” articles.

In some cases this is just by virtue of the fact that typical sharing or search results may give no obvious indication that these are satire or parody — and such items may be innocently shared to large numbers of persons as if they were serious items. In other cases, the sharer knows that they’re dealing with satire but purposely promotes the items as non-satire if this fits with their political agenda of the moment.

In either case, if such stories were clearly marked (as parody or satire, referencing the original source) in search results or in Facebook shares, Twitter feeds, etc., the purposeful and/or accidental damage they can do when they’re inappropriately interpreted by users as serious items could be significantly reduced.

Such specific labeling of individual items that are known to be originally sourced from self-proclaimed satire/parody sites — irrespective of their current share or search results links — could provide something of an initial proving ground for the overall labeling concept. If such items could be identified in the various search and sharing systems as having such sites as their origins, it could help to demonstrate the usefulness of this labeling technique on this specific class of material that would be relatively straightforward to target. User reactions to these labels could then be studied toward the launch of a possible much broader labeling initiative dealing with fake/false news in a more comprehensive manner.
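As a rough illustration of how simple the first step of such a proving ground could be, the Python sketch below labels a link when its URL points at a registered satire source. Everything in it is an assumption made for the example: the `SATIRE_SOURCES` registry, the domain and path entries used for The Onion and the Borowitz column, and the label wording. It only handles links that point directly at the source. Tracing copies that have been re-hosted or re-shared from elsewhere, which the proposal above also calls for, is a much harder problem that this sketch does not attempt.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical registry of self-proclaimed satire sources:
# (domain, path prefix, display name). The entries are illustrative guesses.
SATIRE_SOURCES = [
    ("theonion.com", "/", "The Onion"),
    ("newyorker.com", "/humor/borowitz-report", "Borowitz (The New Yorker)"),
]


def satire_label(url: str) -> Optional[str]:
    """Return a satire/parody label for the URL, or None if no source matches."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path or "/"
    for domain, prefix, name in SATIRE_SOURCES:
        # Match the domain itself or any subdomain such as "www.".
        host_matches = host == domain or host.endswith("." + domain)
        if host_matches and path.startswith(prefix):
            return f"SATIRE/PARODY (source: {name})"
    return None
```

For example, `satire_label("https://www.theonion.com/some-story")` yields a label naming The Onion, while an ordinary news URL on the same domain as a satire column is left unlabeled because the path prefix does not match.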

None of this will be easy, nor are these the only possible approaches. But we must immediately begin vigorously moving down the paths towards practical solutions to the serious, rapidly escalating issues of fake news and related problems on the Internet, unless we’re satisfied to be increasingly suffocated under a growing and ultimately disastrous deluge of lies.

I have consulted to Google, but I am not currently doing so — my opinions expressed here are mine alone.
– – –
The correct term is “Internet” NOT “internet” — please don’t fall into the trap of using the latter. It’s just plain wrong!

Study: Collecting URLs and Other Data About Fake/False News on the Net

Greetings. I have initiated a study to explore the extent of fake/false news on the Internet. Please use the form at:

to report fake or false news found on traditional websites and/or in social media postings.

Any information submitted via this form may be made public after verification, with the exception of your name and/or email address if provided (which will be kept private and will not be used for any purposes other than this study).

URLs anywhere in the world may be reported, but please only report URLs whose contents are in English for now. Please only report URLs that are public and can be accessed without a login being required.

Thank you for participating in this study to better understand the nature and scope of fake/false news on the Net.


Google Home Drops Insightful “Donald Trump Is Definitely Crazy” Search Answer

Two days ago, I uploaded the YouTube video linked below, which recorded the insightful response I received from Google Home to the highly relevant question: “Is Donald Trump Insane?” I noted Google’s accurate appraisal on Google+ and in my various public mailing lists. The next day (yesterday) the response was (and currently is) gone for the same query to Home — replaced by the generic: “I can do a search for that.”

Interestingly, this seems to have only occurred for responses from Google Home itself. The original (text-based) answer is currently still appearing for the same query made by keyboard or voice to Google Search through conventional desktop or mobile means (however, at least for me the response is no longer being spoken out loud — and I had earlier reports that the answer response was spoken on all capable platforms).

Let’s face it — what helps to make the original answer so great is the pacing and inflections of the excellent Google Home synthetic voice! It’s just not the same reading it as text.

There would seem to be only two possibilities for what’s going on.

One possibility is that the normal churning of Google’s algorithms dropped that answer from Home (and replaced it with the generic response) solely through ordinary programmed processes.

Of course, the other possibility is that after I publicized this brilliant, wonderful, and fully accurate spoken response, it was manually excised from Home by someone at Google for reasons of their own, about which I will not speculate here and now.

Either way, the timing of this change, only hours after my release of the related video, is — shall we say — fascinating.


How Fake and False News Distort Google and Others

With all of the current discussions regarding the false and fake news glut on the Internet — often racist in nature, some purely domestic in origin, some now believed to be instigated by Putin’s Russia — it’s obvious that the status quo for dealing with such materials is increasingly untenable.

But what to do about all this?

As I have previously discussed, my general view is that more information — not less — is the best solution to these distortions that may have easily turned the 2016 election on its head.

Labeling, tagging, and downranking of clearly false or fake posts is an approach that can help to reduce the tendency for outright lies to be treated equivalently with truth in social media and search engines. These techniques also avoid invoking the actual removal of lying items themselves and the “censorship” issues that then may come into play (though private firms quite appropriately are indeed free to determine what materials they wish to permit and host — the First Amendment only applies to governmental restraints on speech in the USA).

How effective might such labeling be? Think about the labeling of “fake news” in the same sort of vein as the health warnings on cigarette packs. We haven’t banned cigarettes. Some people ignore the health warnings, and many people still smoke in the USA. But the number of people smoking has dropped dramatically, and studies show that those health warnings have played a major role in that decrease.

Labeling fake and false news to indicate that status — and there’s a vast array of such materials whose falsity no reasonable argument can dispute — could have a dramatic positive impact. Controversial? Yep. Difficult? Sure. But I believe that this can be approached gradually, starting with top trending stories and top search results.

A cure-all? No, just as cigarette health warnings haven’t been cure-alls. But many lives have still been saved. And the same applies to dealing with fake news and similar lies masquerading as truthful posts.

Naysayers suggest that it’s impossible to determine what’s true or isn’t true on the Internet, so any attempts to designate anything that’s posted as really true or false must fail. This is nonsense. And while I’ve previously noted some examples (Man landing on the moon, Obama born in Hawaii) it’s not hard to find all manner of politically-motivated lies that are just as easy to ferret out.

For example, if you currently do a Google search (at least in the USA) for:

southern poverty law center

You will likely find an item on the first page of results (even before some of the SPLC’s own links) from online Alt-Right racist rag Breitbart — whose traditional overlord Steve Bannon has now been given a senior role in the upcoming Trump administration.

The link says:

FBI Dumps Southern Poverty Law Center as Hate Crimes Resource

Actually, this is a false story, dating back to 2014. It’s an item that was also picked up from Breitbart and republished by an array of other racist sites who hate the good work of the SPLC fighting both racism and hate speech.

Now, look elsewhere on that page of Google search results — then on the next few pages. No mention of the fact that the original story is false, that even the FBI itself issued a statement noting that they were still working with the SPLC on an unchanged basis.

Instead of anything to indicate that the original link is promoting a false story, what you’ll mostly find on succeeding pages is more anti-SPLC right-wing propaganda.

This situation isn’t strictly Google’s fault. I don’t know the innards of Google’s search ranking algorithms, but I think it’s a fair bet that “truth” is not a major signal in and of itself. More likely there’s an implicit assumption — which no longer appears to necessarily hold true — that truthful items will tend to rise to the top of search results via other signals that form inputs to the ranking mechanisms.

In this case, we know with absolute certainty that the original story on page one of those results is a continuing lie, and the FBI has confirmed this (in fact, anyone can look at the appropriate FBI pages themselves and categorically confirm this fact as well).

Truth matters. There is no equivalency between truth and lies, or otherwise false or faked information.

In my view, Google should be dedicated to the promulgation of widely accepted truths whenever possible. (Ironic side note: The horrible EU “Right To Be Forgotten” — RTBF — that has been imposed on Google, is itself specifically dedicated to actually hiding truths!)

As I’ve suggested, the promotion of truth over lies could be accomplished both by downranking of clearly false items, and/or by labeling such items as (for example) “DEEMED FALSE” — perhaps along with a link to a page that provides specific evidence supporting that label (in the SPLC example under discussion, the relevant page of the FBI site would be an obvious link candidate).
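The combination of labeling and downranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a claim about how Google's ranking actually works: the `Result` type, the numeric `score`, the `penalty` multiplier, and the `verdicts` mapping from a URL to a supporting evidence page are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Result:
    """One search result (hypothetical structure)."""
    url: str
    title: str
    score: float                        # assumed relevance score, higher ranks first
    label: Optional[str] = None
    evidence_url: Optional[str] = None  # page supporting the label


def apply_verdicts(results: List[Result],
                   verdicts: Dict[str, str],
                   penalty: float = 0.5) -> List[Result]:
    """Label and downrank results whose URL has a 'deemed false' verdict.

    `verdicts` maps a result URL to the URL of a page with supporting evidence.
    Flagged items stay in the results: they are labeled, linked to the
    evidence, and have their score multiplied by `penalty`.
    """
    adjusted = []
    for r in results:
        evidence = verdicts.get(r.url)
        if evidence is not None:
            r = Result(r.url, r.title, r.score * penalty, "DEEMED FALSE", evidence)
        adjusted.append(r)
    # Re-sort so the penalty actually changes the order shown to the user.
    return sorted(adjusted, key=lambda r: r.score, reverse=True)
```

Note that nothing is removed in this sketch. The flagged item remains available, which is the point of preferring labels and downranking over outright deletion.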

None of this is simple. The limitations, dynamics, logistics, and all other aspects of moving toward promoting truth over lies in social media and search results will be an enormous ongoing effort — but a crucial one.

The fake news, filter bubbles, echo chambers, and hate speech issues that are now drowning the Internet are of such a degree that we need to call a major summit of social media and search firms, experts, and other concerned parties on a multidisciplinary basis to begin hammering out practical industry-wide solutions. Associated working groups should be established forthwith.

If we don’t act soon, we will be utterly inundated by the false “realities” that are being created by evil players in our Internet ecosystems, who have become adept at leveraging our technology against us — and against truth.

There is definitely no time to waste.


Blocked by Lauren (“The Motion Picture”)

With nearly 400K Google+ followers, I’ve needed to block “a few” over the years to keep order in the comment sections of my threads. I’m frequently asked for that list — which of course is composed entirely of public G+ profile information. But as far as I know there is no practical way to export this data in textual form. However, when in doubt, make a video! By the way, I do consider unblocking requests, and frequently unblock previously blocked profiles as a result, depending on specific circumstances. Happy Thanksgiving!
