Fixing Google’s Gmail Spam Problems

The anti-spam methodology used by Google’s Gmail system, and by most other large email processing systems, suffers from a glaring flaw that has unfortunately become standard practice in email handling.

One of the most common complaints I receive from Google users is that important email has gone “missing” in some mysterious manner.

The mystery is usually quickly solved — but a real solution is beyond my abilities to deploy widely on my own.

The problem is the ubiquitous “Spam” folder, a concept that has actually helped to massively increase the amount of spam flowing over the Internet.

Many users turn out to not even realize that they have a Spam folder. It’s there, but unnoticed by many.

But even users who know about the Spam folder tend to rarely bother checking it — many users have never looked inside, not even once. Google’s spam detection algorithm is so good that non-spam relatively rarely ends up in the Spam folder.

And therein lies the rub. Google’s algorithms are indeed good, but of course are not perfect. False positives (important email incorrectly relegated to the Spam folder) can be a really big deal, especially when important financial notifications are involved.

In theory, routine use of Gmail’s “filter” options could help to tame this problem and keep some false positives from being buried unseen. But the reality is that many of these important false positives don’t come from expected sources, and many users don’t know how to use the Gmail filter system, and may in fact be totally unaware of its existence. And frankly, the existing Gmail filtering user interface is not well suited to managing the large and growing numbers of filters needed to try to deal with this situation (whether for actual spam or for false positives). Trust me on this, I’ve tried!

So could we just train users to routinely check the Spam folder for important stuff that might have gotten in there by accident? That’s a tough one, but even then there’s another problem.

Many Gmail users receive so much spam — much of it highly repetitive — that manually plowing through the Spam folder looking for false positives is necessarily time consuming and prone to the error of missing important items, no matter how careful you attempt to be. Ask me how I know!

This takes us to the intrinsic problem with the Spam folder concept. Gmail and most other major mail systems accept many of the spam emails from the creepy servers that vomit them across the Net by the billions. Then they’re relegated to users’ spam folders, where they help to bury the important non-spam emails that shouldn’t be in there in the first place.

Since Google accepts much of this spam, the senders are happy and keep sending spam to the same addresses, seemingly endlessly. So you keep seeing the same kinds of spam — ranging from annoying to disgusting — over and over and over again. The sender names may vary, the sending servers usually have obviously bogus identities, but (unlike some malware that Google rejects immediately) the spam keeps getting delivered anyway.

The solution is obvious, even though nontrivial to implement at Google Scale. It’s a technique used by many smaller mail systems — my own mail servers have been using variations of this technique for decades.

Specifically, users need to be able to designate that particular types of spam will never be delivered to them at all, not even to the Spam folder. Attempts at delivering those messages should be rejected at the SMTP server level. We can have a discussion later about the most appropriate reject response codes in these circumstances; there are various ways to handle this.
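
To make the idea concrete, here is a minimal sketch in Python of how a per-user reject policy might choose the SMTP reply. It is an illustration only, not a description of how Gmail or my own servers work. The names RejectRule, UserPolicy and smtp_reply are hypothetical, the rules are plain regular expressions on lowercase header names, and the sketch assumes one recipient per message, since the reply given after DATA applies to the whole message.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RejectRule:
    """One user-chosen pattern (hypothetical structure, for illustration)."""
    header: str   # lowercase header name, e.g. "subject" or "from"
    pattern: str  # regular expression, matched case-insensitively


@dataclass
class UserPolicy:
    """The set of reject rules one user has chosen (hypothetical)."""
    rules: list[RejectRule] = field(default_factory=list)


def smtp_reply(policy: UserPolicy, headers: dict[str, str]) -> str:
    """Return the SMTP reply to give after DATA for a single recipient."""
    for rule in policy.rules:
        value = headers.get(rule.header, "")
        if re.search(rule.pattern, value, re.IGNORECASE):
            # A 5xx reply is a permanent failure: the message is refused
            # outright and never stored, not even in a Spam folder.
            return "550 5.7.1 Message refused by recipient policy"
    # No rule matched, so accept the message for normal processing.
    return "250 2.0.0 OK"
```

For example, a user with the single rule RejectRule("subject", r"miracle pills") would have a message with the subject "MIRACLE PILLS inside" refused with a 550, while a bank statement would be accepted as usual. The 550 5.7.1 code is just one plausible choice among the reject codes mentioned above.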

Specifying the kinds of spam messages to be given this “delivery death penalty” treatment is nontrivial, both from a user interface and implementation standpoint — but I suspect that Google’s AI resources could be of immense assistance in this context. Nor would I assert that a “real-time” reject mechanism like this would be without cost to Google — but it would certainly be immensely useful and user-positive.

The data from my own servers suggests that once you start rejecting spam email rather than accepting it, the overall level of spam attempts ultimately goes down rather than up. This is especially true if spam attempts are greeted with a “no such user” reject even when that user actually exists (yes, this is a controversial measure).
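
The “no such user” variant can be sketched as a check at the RCPT TO stage. Again, this is a rough illustration under stated assumptions, not my servers’ actual configuration: the function name rcpt_reply, the mailboxes set, and the per-user blocked_hosts mapping are all hypothetical, and identifying a sender by a single host name is a simplification.

```python
def rcpt_reply(
    recipient: str,
    client_host: str,
    mailboxes: set[str],
    blocked_hosts: dict[str, set[str]],
) -> str:
    """Return the SMTP reply to a RCPT TO command (illustrative only)."""
    if recipient not in mailboxes:
        # The mailbox really does not exist.
        return "550 5.1.1 No such user"
    if client_host in blocked_hosts.get(recipient, set()):
        # The mailbox exists, but this sender is on the user's reject
        # list. Deliberately give the same reply as for a nonexistent
        # mailbox, so the sender cannot tell the two cases apart.
        return "550 5.1.1 No such user"
    # Known mailbox, sender not blocked: accept the recipient.
    return "250 2.1.5 OK"
```

The point of returning an identical reply in both rejection cases is that the sending side sees what looks like a dead address. The cost, and part of why the measure is controversial, is that a legitimate sender who is wrongly blocked would also be told that the user does not exist.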

There is certainly a range of ways that we could approach this set of problems, but I’m convinced that the current technique of just accepting most spam and tossing it into a Spam folder is not helping to stop the scourge of spam, and in fact is making it far worse over time.

–Lauren–

4 thoughts on “Fixing Google’s Gmail Spam Problems”

  1. Hi Lauren,

    I came across your article on 1-30-2019, after doing further research on the topic of spam e-mail filtering. In July of 2017, I was hit with a DDoS attack and then a directed DoS attack on my personal Gmail address account, and survived both attacks. It took me a year and a half to tame the spam e-mail in my Gmail e-mail address after both attacks. Sadly though, I still have one remaining spamming network, and that one is known worldwide as problematic. With more and more restrictive filtering, and now contacting the spammer’s mail server, I am hoping I can eliminate the spam e-mail from this notorious, globally known spamming network. In the past year and a half [and currently still going,] I was forced into a crash course in how e-mail works and how it is handled across the internet.

    Specifically regarding Gmail from Google, I agree that the way things are handled by Google with its Gmail product with the ‘filtering’ of spam e-mail on the end-user’s side, is antiquated and is a fruitless effort by someone not proficient in correct filtering of content. There should be end-user access in Gmail to filter out spam e-mail at the mail server level, using SMTP denial protocols, for advanced users. I’ve personally sent messages to Google’s Abuse Department requesting the placement of two different mail servers at the mail exchange level (MX) for my personal Gmail account, to deny e-mail from those mail servers. Each time, I have not heard back from Google’s Abuse Department at all (I was not really expecting a response anyway.) But, I have received an abrupt stop in the spam e-mail from the directed, DoS spamming network… Though, it might be because I escalated the complaint of the directed, DoS attack to an internet safety consortium that went after the guilty party of the directed, DoS attack at the same time I first asked Google for the MX blacklisting of a mail server.

    The internet is for the most part, safe. But, there are obvious flaws that ‘can be’ tamed, if not fixed, if e-mail service providers would actually consider the view of the public and make an honest attempt to squelch spam e-mail. The ‘react to’ way of doing business against spam e-mail, is severely outdated.

  2. Amen. Not only would rejecting the incoming mail dissuade spammers, it would also let senders of legitimate false positives know that that email was not going to be read by the recipient so at least they could try other forms of contact rather than just having a message be silently lost forever.

    It is frustrating that Gmail has such a ham-handed approach to spam control. With a few enlightened steps they really could make a huge impact on the problem and save the world so much wasted time and effort.

  3. 500 plus spam emails per day in my gmail spam filter, NO way to automatically delete them. EVERY day, at least an hour to clean the spam folder, one at a time. Gmail updates only make the entire gmail system worse. A simple way to automatically delete them as they arrive would sure help.

    1. Gmail does now reject some sites’ email as spam at SMTP ingress, but as you can imagine this is a continuing game of Whac-a-Mole as spammers rotate through an essentially endless series of outbound sites that are hosting spammer activities (knowingly or not).
