Action Items: What Google, Facebook, and Others Should Be Doing RIGHT NOW About Fake News

Today is action items day, and there isn’t a moment to lose before someone gets killed as a result of the fake news scourge. It nearly happened a couple of days ago, when some wacko invaded a pizza restaurant and shot it up looking for the youthful “sex slaves” that the fake “Pizzagate” story claims exist (a total fabrication created out of whole cloth and part of the complex of fake anti-Hillary sex stories even being promoted by highly-placed wackos in Trump’s White House circle). In fact, there are already new fake stories circulating regarding the shooting itself.

There are some ongoing efforts to begin dealing with fake and false news at the big firms. Facebook appears to be running an experiment asking some users to rate how “misleading” some link titles might be. This will no doubt collect some interesting data and may be a small portion of solutions, but of course cannot alone solve the underlying problems.

Having spent enough time inside Google to have some sense of how the world looks at Google Scale (i.e. “Big” with a Capital “B”), I am convinced that efforts to deal with the Fake/False News problem must primarily be based on algorithmic, automated systems. Humans will also still have important roles to play in this process in terms of tagging, flagging, and verification at least — especially for items that are suspected or verified fakes but are still trending upward very rapidly.

So, Action Item #1: We should be looking at automated systems for doing the bulk of the first level work to detect fakes, or else we’ll be swamped from the word go.

And I believe that the foundational resources to get this done do exist. Google and Facebook (just to name two obvious examples) have powerful AI architectures that could be leveraged toward such tasks, given the will to do so.

Action Item #2: We must understand the true dynamics of how fake and false news are shared — how they rapidly reach large numbers of users and push high into search results. It’s popular to simply assert that everyone believing/sharing these fake stories are just evil or stupid (or both).

That’s way too simplistic an assertion. Even over the very short time that my fake news data collection effort has been active, obvious patterns in the data are already emerging.

One pattern that hits you in the face immediately is that the vast majority of users who share fake news are not stupid and not evil, but they are very much confused by the misinformation surrounding them. There’s a sense that “Well, if it looks professional, or if this ranks highly in search, or if Facebook showed it to me, or my friends shared it with me, it at least might be true, there might something to it somehow, so I’ll share it too!”

This appears to be a far, far larger group of users than the ones who are actually generating and voluntarily wallowing in this trash. In fact, the latter group is voluntarily in their own “echo chambers” — and like with most any group of dedicated haters, Internet-based efforts to change their minds will likely be wasted.

But for much a larger segment of users who are misinformed, confused, and don’t even realize that they have become involuntarily trapped in echo chambers by fake and false news, there is definitely still hope.

This emphasizes a key point that various observers including myself have previously noted. Older users and other users with less Internet experience tend to believe items that look professional, that appear to be from sources that are visually attractive and seemingly structured in a more “news traditional” manner. On the other hand, younger users or other users with more Internet experience tend to care much less — or not at all — about the “professionalism” of the source and give much more credence to items that rank highly in search, are surfaced by services like Facebook, or are widely shared by their friends.

And this gets us to the crux of the matter. By and large, the Internet economy has evolved into a click-based popularity contest. Both in terms of search and social media, it is basically designed to surface content based on how many people appear to have interest in that content. That’s somewhat a simplification of course but it’s fairly close to the mark. And let’s face it, given two stories presented as accurate — one that discusses how people eat pizza, the other an actually fake story describing a nonexistent child sex ring — which is likely to get the most clicks — and so the most revenue?

While a variety of the big fake news sites are related to persons with political motives, a large number are operated by individuals who have no political motives at all — they are “merely” enriching themselves by creating false stories that they believe will get the most shares and “engagement” clicks for their own monetary enrichment.

On the other hand, I’ll tell you as one of the individuals involved in Internet development for decades that we did not build and grow the Net to be a tool for paying people to post fake news, nor to use such false content to help elect a lying sociopath as President of the United States.

Yet the click-based Internet economy is what it is, and alternative models such as subscriptions have seen only limited success. Other concepts such as micropayments even less so.

So what are we to do? This brings us to …

Action Item #3: I continue to strongly feel that censorship is not the best answer to this set of problems, and that more information — not less — is the path toward solutions. Downranking — where fake stories would still exist but no longer be so prominently featured in search results or system shares — can be a viable approach if handled with caution. In particular, only the most serious and dangerous fake content would typically be considered for manual downranking. For most fake news situations, organic (natural) downranking is a much more desirable procedure.

And that’s where labeling comes in. If fake news that has managed to reach high search results and massive sharing were labeled as fake or in some other relevant distinctive manner, I believe that this would give some pause to that large group of confused users, result in less sharing of fakes, and ultimately in the organic downranking of many such stories.

What’s more, in comments I’ve received it’s clear that many users are desperate for help in evaluating the truth of the content that comes pouring in at them now. How can we really blame them for accepting false stories as real when we don’t even make the effort to point out and label the fakes that we definitely know about?

Obviously it’s the case that detecting, evaluating, and labeling content on an Internet scale — even if we restrict our efforts to highly trending and highly ranked items —  is a very significant undertaking, even with the best of AI resources doing the bulk of the work. Such issues as the exact wording of labels can also be complex. Do we actually want to label a known false story as “false” per se? Snopes does this successfully at their relatively limited scale, but they don’t have particularly deep pockets, either (ironically but predictably, all manner of fake news stories are written and widely promulgated against Snopes). Another approach as an alternative to a specific “false” label would be the assigning of a kind of “confidence rank” to such stories — with the known fakes perhaps getting a rank of zero.

As always, the devil is in the details, but I’m convinced that some combination of these or related concepts can be made to work, especially given that the status quo is no longer tenable.

Action Item #4: Parody as a test case. The ability of many (most?) people to recognize parody or satire on the Net (unless it is clearly labeled) can be very poor. I ran into this myself when I wrote April Fools’ columns for the CACM journal — even with that highly technical audience some readers assumed that what I thought was obvious and outrageous satire was actually real. The same thing happened with a satire video I released on YouTube years ago as well.

A significant number of the “fake news” stories are sourced from satire sites (that is, at least ostensibly satirical sites — many seem to call themselves satire in small print to try cover fake items with clearly political motives, or mix fake and real items on their sites to cause even more confusion). Yet even items from known satire sources like “The Onion” — and “Borowitz” from “The New Yorker” — frequently explode into mass visibility without any indication that they aren’t “legit” articles.
In some cases this is just by virtue of the fact that typical sharing or search results may give no obvious indication that these are satire or parody — and such items may be innocently shared to large numbers of persons as if they were serious items. In other cases, the sharer knows that they’re dealing with satire but purposely promotes the items as non-satire if this fits with their political agenda of the moment.
In either case, if such stories were clearly marked (as parody or satire, referencing the original source) in search results or in Facebook shares, Twitter feeds, etc., the purposeful and/or accidental damage they can do when they’re inappropriately interpreted by users as serious items could be significantly reduced.

Such specific labeling of individual items that are known to be originally sourced from self-proclaimed satire/parody sites — irrespective of their current share or search results links — could provide something of an initial proving ground for the overall labeling concept. If such items could be identified in the various search and sharing systems as having such sites as their origins, it could help to demonstrate the usefulness of this labeling technique on this specific class of material that would be relatively straightforward to target. User reactions to these labels could then be studied toward the launch of a possible much broader labeling initiative dealing with fake/false news in a more comprehensive manner.

None of this will be easy, nor are these the only possible approaches. But we must immediately begin vigorously moving down the paths towards practical solutions to the serious, rapidly escalating issues of fake news and related problems on the Internet, unless we’re satisfied to be increasingly suffocated under a growing and ultimately disastrous deluge of lies.

I have consulted to Google, but I am not currently doing so — my opinions expressed here are mine alone.
– – –
The correct term is “Internet” NOT “internet” — please don’t fall into the trap of using the latter. It’s just plain wrong!

Study: Collecting URLs and Other Data About Fake/False News on the Net
Administrivia: Observing Google: "Tough Love"