What Is Ghost and Referral Spam Traffic, and Why Does It Suck?
Spam has evolved. It’s not just an inbox and search engine problem anymore; it’s found its way into your Google Analytics account. Just as spammers will stoop to the lowest common denominator to squeeze into your email inbox, they’ve picked up on flaws in the system to show up in your data reports.
They do it with the dimmest glimmer of hope that you’ll wonder what the hell they’re doing in your report and visit their website out of curiosity.
Tell me about it! It makes data a mess, both for my personal sites and for the clients’ sites that I work with at The Magistrate.
But, moar web traffic?
The thing is, these bots never actually visit your site.
Even so, they can seriously skew your analytics numbers, including key stats like bounce rate and other engagement metrics.
If you’re making big content marketing investments based on these numbers, it’s important that they’re as accurate as they can be.
This has made ghost and referral spam traffic a big problem for:
- Small businesses and solopreneurs
- Medium businesses with no dedicated marketer
- Marketing Agencies small and large
And the kicker? These agents of Voldemort work fast. Real fast.
Not only is the number of spam hits increasing every day, but so is the number of sources that have to be blacklisted and eliminated.
We’ve even seen referral spammers try such nonsensical techniques as trying to disguise themselves as Google. Why? Who knows?
Here is what we see on our side:
It’s particularly troubling if your site is relatively new and is not yet getting much legitimate web traffic. The spam percentages are much higher, and will skew your data much more than if your site has thousands of hits a day.
Here’s an example of a personal site of mine. I haven’t paid much attention to it, so it doesn’t get a lot of hits. But with a quick look at the orange segment, you can see that only 80% of the traffic recorded in Analytics is legitimate. 20% is spam traffic!
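To put a number on how a 20% spam share can bend an engagement metric, here is a rough sketch in Python. The session counts and the 50% "real" bounce rate are made up for illustration, and treating every spam session as a bounce is an assumption, not something measured from my reports.

```python
def blended_bounce_rate(real_sessions, real_bounce, spam_sessions, spam_bounce=1.0):
    """Bounce rate a report would show when real and spam sessions are mixed."""
    total = real_sessions + spam_sessions
    bounces = real_sessions * real_bounce + spam_sessions * spam_bounce
    return bounces / total

# Hypothetical numbers: 800 real sessions bouncing 50% of the time,
# plus 200 spam sessions that all count as bounces (assumed).
reported = blended_bounce_rate(800, 0.50, 200)
print(f"Reported bounce rate: {reported:.0%}")  # Reported bounce rate: 60%
```

With these made-up inputs, a true 50% bounce rate shows up as 60%, which is enough to send you chasing a content problem that doesn't exist.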
The bottom line is that you need clean data to make informed decisions about your website. And to do that, you need to address and clean up this mess.
Start now, because they are only going to improve their game.
Ever Wonder How Easy It Is?
A single referrer record in Analytics is a single “page load.”
That tracking “page load” took 0.001 seconds on a server somewhere. At the same time, that server was also loading 100 other “page loads” for different sites to muscle their way into everyone’s GA account.
When you consider how easy it is to buy another (twenty) $5 host, you’ll really grasp how easy it is for this system to get way out of hand.
If the ROI is there, this problem gets far worse before it gets any better.
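If you want the back-of-the-napkin math, here is a small sketch. It takes the 0.001-second figure at face value and assumes each host does nothing but send fake hits back to back, so treat the result as a theoretical ceiling rather than what any real spammer achieves.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def fake_hits_per_day(hit_ms=1, hosts=1):
    """Upper bound on fake "page loads" per day if each host sends hits back to back."""
    hits_per_second = 1000 // hit_ms  # 1 ms per hit works out to 1,000 hits per second
    return hits_per_second * SECONDS_PER_DAY * hosts

one_host = fake_hits_per_day()
twenty_hosts = fake_hits_per_day(hosts=20)
print(f"{one_host:,} hits/day from one host")         # 86,400,000
print(f"{twenty_hosts:,} hits/day from twenty hosts")  # 1,728,000,000
```

Even if the real throughput is a tiny fraction of that ceiling, the cost per fake hit is effectively zero.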
Coming Up Short: Tactics that Don’t Take it All the Way
This issue first became known to the public a few years ago when a mysterious online service called Semalt (hate these jerks) started to use the technique to appear on Analytics reports.
And, as always, social media reacted.
If you don’t believe them, believe me. It was everywhere, and it’s still rampant.
But with a big problem comes an innovative solution, or so we thought.
As it turns out, these spammers are so active, and their technique is so good, that many of the fixes pitched as a “solution” did not work.
Hell, you’ve probably tried a few of them yourself.
In preparation for this article, I went through my considerable number of browser bookmarks and my Pocket archive to find all of the guides I had used before prioritizing this in-house fix for our team.
Techniques that do not actually solve this problem include:
- Changing your .htaccess file: This method will not work against advanced tactics. Ghost spam never touches your site, which renders this method useless.
- Using the referral exclusion/blocking list: Good setup, but the list gets no updates.
- Sourcing exclusion lists into exclusion filters: This only excludes and blocks future spam, and does nothing about the referrers from yesteryear.
The only one that really came close was the exclusion filter. The real problem there was that it was very difficult to find current and consistently-updated lists. Many of the founders/creators of such lists just weren’t actually invested in keeping a solution updated.
The constant maintenance needed to keep a list like that current stops it from being an effective solution to the problem, especially when there is no profit in doing so.
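For context, the exclusion-filter approach boils down to turning a list of spam referrers into one or more regex patterns. Here is a minimal sketch of that step. The domain list is illustrative and not a vetted blocklist, `build_filter_patterns` is a hypothetical helper, and the 255-character cap is my assumption about the filter field, so adjust it to whatever limit you actually hit.

```python
import re

def build_filter_patterns(domains, max_len=255):
    """Join escaped domains with '|' into as few patterns as fit within max_len."""
    patterns, current = [], ""
    for domain in sorted(set(domains)):
        escaped = re.escape(domain)  # so "." matches a literal dot only
        candidate = f"{current}|{escaped}" if current else escaped
        if len(candidate) > max_len and current:
            patterns.append(current)  # current pattern is full, start a new one
            current = escaped
        else:
            current = candidate
    if current:
        patterns.append(current)
    return patterns

# Illustrative entries only; a real list needs constant updating.
spam_domains = ["semalt.com", "spam-example.test", "another-spammer.test"]
patterns = build_filter_patterns(spam_domains)
for pattern in patterns:
    print(pattern)
```

The code is the easy part. The hard part is that `spam_domains` goes stale almost immediately, which is exactly where the lists I found fell down.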
The Missing Puzzle Piece
To be reasonable and effective, a solution to identify and weed out ghost and referral spam traffic would need to be:
- Very regularly updated
- Retroactive to past data
- Sourced from a large base of data
Using those principles as guidelines, we crafted the process that works so well for us now.
Step 1: Using Segments to Filter and Block Spam
Just in case you need a refresher:
- Filters allow you to include or block data from your reporting data set. Keep in mind that filters are destructive: anything you filter out, accidentally or otherwise, is gone forever. They also cannot change past data.
- Segments, on the other hand, are a subset of users or sessions. You can turn segments on and off, as they are not destructive, and can be applied to past data.
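If the difference is still fuzzy, here is a toy model of it in Python. This is not how Analytics works internally. The hit records, the `SPAM_SOURCES` set, and both function names are made up purely to show "drop on the way in" versus "hide at report time".

```python
SPAM_SOURCES = {"spam-example.test"}  # hypothetical spam referrer

def store_with_filter(stored_hits, incoming_hits):
    """Filter: spam is dropped before storage; rows already stored are untouched."""
    kept = [hit for hit in incoming_hits if hit["source"] not in SPAM_SOURCES]
    return stored_hits + kept

def view_with_segment(stored_hits):
    """Segment: a view computed at report time; the stored data is unchanged."""
    return [hit for hit in stored_hits if hit["source"] not in SPAM_SOURCES]

# History recorded before any filter existed, spam included.
history = [{"source": "google"}, {"source": "spam-example.test"}]
new_hits = [{"source": "spam-example.test"}, {"source": "newsletter"}]

after_filter = store_with_filter(history, new_hits)
print(len(after_filter))                     # 3: the old spam row is still in there
print(len(view_with_segment(after_filter)))  # 2: the segment hides old spam too
print(len(after_filter))                     # 3: nothing was deleted by the segment
```

The filter only protects data from the moment you add it, while the segment cleans up the report for any date range without deleting anything.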