How to Prevent Spam Traffic in Google Analytics for More Accurate Data
Keeping up with spam and ghost traffic in Google Analytics can be aggravating if you want accurate data. While 100% pristine data is unlikely (and not worth the effort), a couple of easy configurations can get you close and safeguard your analytics from spam traffic.
Here are three tips that you can apply to your own Google Analytics account to help prevent spam traffic.
1. Hostname Filter
When a visitor is on your website (say, amazon.com) and triggers a Google Analytics hit (like a page view or event), the hostname recorded for that hit is the domain the page was served from: www.amazon.com.
In most circumstances, this hostname should be your domain, a subdomain of your site, youtube.com (if you have GA connected to your YouTube channel), or in rare cases a custom URL for an email or registration signup form that also carries your GA tracking ID.
Here’s an example of the hostnames from the analytics for Google’s own Merchandise Store:
And here’s an example of spammy traffic (note the hostnames are all different):
To view this for your own data, go to:
- Acquisition > All Traffic > Referrals
- Set a secondary dimension of Hostname (as seen above)
If you see a hostname of (not set) or other unexpected hostnames (like your staging or dev site), you can filter them out with the following filter:
- Filter Type = Custom
- Include Filter Field = Hostname
- Filter Pattern = getelevar|youtube
You’ll want to use your own domain instead of getelevar. The | means “or” in regex, so this pattern matches either name.
Once the filter is applied, hits from any other hostname no longer appear in your reports.
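Before saving a filter like this, it can help to check the pattern against hostnames you have actually seen in your reports. The following is a minimal Python sketch, not GA's own implementation: it assumes the filter pattern is applied as a partial, case-insensitive regex match, and the sample hostnames are hypothetical.

```python
import re

# Same pattern as the filter above; swap in your own domain.
# Assumption: the pattern is matched anywhere in the hostname, ignoring case.
HOSTNAME_PATTERN = re.compile(r"getelevar|youtube", re.IGNORECASE)


def keep_hit(hostname: str) -> bool:
    """Return True if an include filter with this pattern would keep the hit."""
    return bool(HOSTNAME_PATTERN.search(hostname))


# Hypothetical hostnames, for illustration only.
sample_hostnames = [
    "www.getelevar.com",
    "shop.getelevar.com",
    "www.youtube.com",
    "(not set)",
    "staging.example.net",
]

kept = [h for h in sample_hostnames if keep_hit(h)]
dropped = [h for h in sample_hostnames if not keep_hit(h)]

print("kept:", kept)
print("dropped:", dropped)
```

Because the match is partial, a hostname such as getelevar.spam.example would also be kept. If that matters for your data, a stricter pattern that includes the full domain (for example getelevar\.com) is one option.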
2. Campaign Sources Filter
Another quick cleanup is the Campaign Source filter. It removes spammy sources whose hits carry a hostname that looks “real” enough to pass the hostname filter.
To set this, take the following steps:
- Filter Type = Custom
- Exclude Filter Field = Campaign Source
- Filter Pattern = brateg.xyz|budilneg.xyz|boltalko.xyz|abcdeg.xyz|biteg.xyz|bukleteg.xyz|buketeg.xyz
Note: the | lets you exclude multiple domains with one filter. These domains come from the screenshot in the hostname example above.
For me, these sources are all spam and are dirtying up my data, so I’m excluding them all.
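If the list of spam sources grows, typing the pattern by hand gets error-prone. Here is a small Python sketch that builds the exclude pattern from a list of domains and checks it against a few sources. The helper names (`build_exclude_pattern`, `is_excluded`) are made up for this example, and it again assumes partial, case-insensitive regex matching. It also escapes the dots, since an unescaped . in a regex matches any character.

```python
import re

# Spam sources taken from the filter pattern above.
SPAM_SOURCES = [
    "brateg.xyz",
    "budilneg.xyz",
    "boltalko.xyz",
    "abcdeg.xyz",
    "biteg.xyz",
    "bukleteg.xyz",
    "buketeg.xyz",
]


def build_exclude_pattern(domains):
    """Join domains with | and escape regex metacharacters such as the dot."""
    return "|".join(re.escape(domain) for domain in domains)


pattern = build_exclude_pattern(SPAM_SOURCES)
spam_re = re.compile(pattern, re.IGNORECASE)


def is_excluded(source: str) -> bool:
    """Return True if an exclude filter with this pattern would drop the source."""
    return bool(spam_re.search(source))


print(pattern)  # paste this into the Filter Pattern field
for source in ["biteg.xyz", "google", "newsletter"]:
    print(source, "->", "excluded" if is_excluded(source) else "kept")
```

The unescaped pattern from the list above works too; escaping only makes the match stricter, so a source that merely resembles one of these names is less likely to be dropped by accident.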
Unsure if certain data is legitimate or not?
If a source shows 0 conversions, a bounce rate above 95%, 0:00 time on site, and 1 page view per session, it’s more than likely spam traffic.
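If you export your source report, the same rule of thumb can be applied in bulk. This is a rough Python sketch under stated assumptions: the `SourceRow` fields and the sample rows are hypothetical and do not mirror any particular GA export format, and the thresholds are just the ones from the rule of thumb above.

```python
from dataclasses import dataclass


@dataclass
class SourceRow:
    """One row of a source report (hypothetical field names)."""
    source: str
    conversions: int
    bounce_rate: float          # percent, 0 to 100
    avg_session_seconds: float  # 0 means 0:00 time on site
    pages_per_session: float


def looks_like_spam(row: SourceRow) -> bool:
    """Flag a row only when all four warning signs are present."""
    return (
        row.conversions == 0
        and row.bounce_rate > 95.0
        and row.avg_session_seconds == 0
        and row.pages_per_session == 1.0
    )


# Made-up rows, for illustration only.
rows = [
    SourceRow("brateg.xyz", 0, 100.0, 0, 1.0),
    SourceRow("google", 12, 48.5, 95.0, 3.2),
    SourceRow("partner-blog.example", 0, 97.0, 14.0, 1.1),
]

suspects = [row.source for row in rows if looks_like_spam(row)]
print(suspects)
```

Requiring all four signs at once keeps the check conservative. A source with a high bounce rate but some time on site, like the third row, is left for manual review rather than flagged.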
3. Native GA Bot Filter Setting
Hopefully you are already ticking this box (highlighted below) in your View Settings:
But if you aren’t, it only takes a second to set up.
Checking it excludes hits from bots and spiders known to Google from your GA reporting data.
Higher Accuracy Drives Smarter Insights
Spam traffic is not only annoying; it can inflate or deflate metrics to the point where you draw inaccurate conclusions from your data. Cleaning it up takes only a few minutes and a few filters.
Give it a shot!