Finding the Cause of Unexpected Direct Traffic in Google Analytics
Finding out why you suddenly see lots of direct traffic can be tricky due to the lack of source information.
So I will show you how to identify this unexpected traffic and, most importantly, how to clean it from your reports so you get accurate analytics.
This post focuses only on direct traffic. For referral/keyword spam, follow this guide.
Possible Causes of Unexpected Direct Traffic
I will split the common causes of spikes in direct traffic into relevant and irrelevant.
Relevant direct traffic
This is traffic that comes from real users and adds valuable data to your Analytics:
- Loyal readers returning to your site; there is nothing wrong with this traffic. Ideally, this would be the only type of direct traffic you should have on your reports.
- Incorrectly tagged campaigns. Even though this is relevant traffic, you still need to fix it. If you recently launched a campaign, check that your links are properly tagged and are not missing any UTM parameters.
Irrelevant direct traffic
This traffic doesn't add any value and should be excluded from your Analytics.
- Internal traffic improperly filtered, especially if you recently did heavy testing on the site. To avoid this, create IP filters for you and your team, or block internal traffic dynamically.
- Ghost traffic done wrong; this is rare, but sometimes spammers forget how to spam. To stop it, simply install a valid hostname filter.
- Referral Exclusion List used for spam; this list has other purposes, and using it for spam will only strip the referral and leave the visit as direct. Remove all the spam from the list and use a filter instead.
- Bot direct traffic; this is the most common scenario and also the most complex, so I will focus the rest of the post on it.
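The valid hostname filter mentioned above is essentially a regular expression listing the hostnames where your tracking code legitimately runs. A minimal sketch of how such a pattern behaves (the hostnames here are placeholders; replace them with your own site and any services that legitimately load your tracking code):

```python
import re

# Placeholder hostnames -- replace with the hosts where your tracking
# code legitimately runs (your site, translation services, etc.).
VALID_HOSTNAMES = re.compile(r"yourdomain\.com|translate\.googleusercontent\.com")

def is_valid_hit(hostname):
    """Return True if the hit's hostname matches the whitelist."""
    return bool(VALID_HOSTNAMES.search(hostname))

print(is_valid_hit("www.yourdomain.com"))   # legitimate hit, kept
print(is_valid_hit("random-spammer.xyz"))   # ghost hit, filtered out
```

In Google Analytics itself you would paste the same pattern into an include filter on the Hostname field; the snippet only illustrates which hits such a pattern keeps.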
Dealing with Direct Traffic Caused by Bots
The first thing you should do, if you haven't already, is enable the Bot Filtering option in your Google Analytics view settings. This will exclude some hits from known bots and spiders.
Unfortunately, most bots are not included in this list, so you will need to manually exclude the ones affecting you.
What is a Bot?
A bot (aka web crawler, spider or robot) is an automated program or script which browses the internet gathering information. Some of them are beneficial to your site, like the Googlebot, while others are irrelevant.
However, no matter the bot's purpose, the data bots leave in your Analytics is useless and may interfere with your real users' data.
Is this Direct Spam Traffic?
I see many people on Analytics forums calling this "Direct Spam Traffic." But is it?
To call it spam, the bot should leave information, like a URL, with the intention of promoting a service or an idea, or getting something from you, as referral spam does.
Bot direct traffic doesn't carry any such information, so I wouldn't consider it spam.
Where do These Bots Come from?
There are thousands of bots crawling the web for different purposes; some are good, and others not so much. Here are some examples:
|Good Bots|Bad Bots|
|---|---|
|Search engine spiders|Spammers|
|Statistics sites|Scraping sites|
|Analytics services|Bots used to drain your resources (like a DDoS attack)|
Whether a bot's purpose is good or bad, both cases are totally irrelevant to your Analytics and should be excluded from your reports.
In the case of bad bots, you will also need to block them from your server, especially those whose objective is to damage your site; hosting companies are usually very helpful with this type of thing.
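Blocking bad bots at the server level starts with spotting them in your access logs. Here is a rough sketch, assuming the common Apache/Nginx "combined" log format and a hand-picked list of suspicious user-agent keywords (both the keywords and the sample lines are illustrative assumptions, not something from a real client):

```python
import re
from collections import Counter

# Keywords commonly seen in bad-bot user agents (illustrative list).
BAD_AGENT_KEYWORDS = ("scrapy", "python-requests", "curl", "semrush")

# Minimal parser for the Apache/Nginx "combined" log format:
# captures the client IP and the quoted user-agent string.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "([^"]*)"')

def suspicious_ips(log_lines):
    """Count requests per IP whose user agent contains a bad-bot keyword."""
    hits = Counter()
    for line in log_lines:
        m = LOG_RE.match(line)
        if not m:
            continue
        ip, agent = m.group(1), m.group(2).lower()
        if any(keyword in agent for keyword in BAD_AGENT_KEYWORDS):
            hits[ip] += 1
    return hits

sample = [
    '203.0.113.7 - - [05/Jul/2015:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Scrapy/1.0"',
    '198.51.100.2 - - [05/Jul/2015:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(suspicious_ips(sample))  # only the Scrapy IP is counted
```

The IPs this surfaces are candidates for a server-level block (firewall rule or deny list); your hosting company can usually take it from there.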
How to Identify Bot Traffic
Before filtering or segmenting out this traffic, it is important to determine whether it really comes from bots, and how to exclude it without excluding real users.
Let's start with a quick analysis of the data.
If you are experiencing dozens, hundreds, or even thousands of direct visits out of nowhere, with a bounce rate close to 100% and an average session time close to 0s, then you are most probably receiving bot traffic.
Common characteristics of bot direct traffic are:
- A sudden spike in direct visits.
- Default Channel Grouping: Direct (of course)
- Landing Page: most of the time your home page, usually represented by a slash (/) or /index.html
- Bounce Rate: usually very high, close to 100%
- Average Session Time: very low, close to 0 seconds
- Pageviews: an average of 1 per session
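The checklist above can be sketched as a simple flag on per-session data. The dictionary keys below are my own naming for illustration, not a Google Analytics export format:

```python
def looks_like_bot(session):
    """Flag a session matching the typical bot-direct-traffic profile."""
    return (
        session["medium"] == "(none)"            # direct traffic
        and session["landing_page"] in ("/", "/index.html")
        and session["bounced"]                   # bounce rate ~100%
        and session["duration_seconds"] <= 1     # session time ~0s
        and session["pageviews"] == 1            # one page per session
    )

bot = {"medium": "(none)", "landing_page": "/", "bounced": True,
       "duration_seconds": 0, "pageviews": 1}
human = {"medium": "(none)", "landing_page": "/pricing", "bounced": False,
         "duration_seconds": 95, "pageviews": 4}
print(looks_like_bot(bot), looks_like_bot(human))  # True False
```

Remember this profile alone is not conclusive; as noted next, real users can match it too, which is why you need more specific patterns.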
Not all traffic matching these characteristics comes from bots, though. You will have to play detective in your Analytics to find more specific characteristics that can later be used to filter or segment the data.
Bots usually replicate their actions from the same system over and over, leaving a trail; for example, a bot may run on Windows 7 with Chrome 43 and Flash version 11.
To start, go to the Direct traffic report in Analytics, select the home page (/), and start adding different secondary dimensions to find common patterns. The more you find, the better!
Dimensions worth checking:
- Browser/Browser version
- Operating system / OS version
- Browser size
- ISP or Network domain
- Flash version
Tip: Open a second window with the same report and select dates when the traffic was normal, then compare the data (browser versions, OS, Flash version).
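This detective work can also be done outside the GA interface. A sketch, assuming you have already exported the suspicious direct sessions as rows of (browser version, OS, Flash version) — the tuples below are made up for illustration:

```python
from collections import Counter

# Each tuple: (browser + version, operating system, Flash version),
# one per suspicious direct session (made-up sample data).
sessions = [
    ("Chrome 43.0.2357", "Windows 7", "11.5 r502"),
    ("Chrome 43.0.2357", "Windows 7", "11.5 r502"),
    ("Chrome 43.0.2357", "Windows 7", "11.5 r502"),
    ("Firefox 47.0", "Mac OS X", "22.0 r0"),
]

# A combination that dominates the spike is a good segment/filter candidate.
pattern, count = Counter(sessions).most_common(1)[0]
print(pattern, count)  # ('Chrome 43.0.2357', 'Windows 7', '11.5 r502') 3
```

A combination that accounts for most of the spike, but is rare on your normal-traffic dates, is exactly the kind of pattern to build a segment around.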
How to Search for Common Characteristics in Bot Traffic
To find some characteristics that will help you exclude this traffic:
- Go to the Reporting section of your Google Analytics and select the period when the direct traffic occurred.
- Expand Acquisition and select Channels.
- Click on Direct and then on the home page (usually represented by a slash /).
- Once there, start selecting different secondary dimensions (at the top of the report).
Here are some characteristics I have used from waves of direct traffic detected across several of my clients; they may coincide with your situation.
- A) July 5, 2015: old Flash versions (11.5 r502, 10.3 r183 and 13.0 r0)
- B) January 25, 2016: Chrome 43.0.2357
- C) In March 2016: Service Provider Hubspot
- D) In July 2016: ISPs from data centers
Cleaning Irrelevant Bot Direct Traffic
Once you find one or more patterns in the previous step (the more, the better), you can use them to create an advanced segment that excludes this traffic.
Why not use a filter instead? Ideally you would want to block this traffic permanently; however, filters allow only one condition, whereas segments can have multiple. This makes segments a lot safer to use.
That said, if you find a very specific characteristic that is very unlikely to belong to a real user, go ahead and create a filter; for example, very old Flash versions, or service providers from data centers or analytics tools.
Otherwise, you are better off creating a segment.
How to Create a Segment to Clean Direct Bot Traffic
To remove bot traffic from Google Analytics:
- Again in the Reporting section of your Google Analytics
- Click on "+ Add Segment" at the top of any of your reports
- Click the red button "+New Segment"
- Almost at the bottom of the window select Conditions. (The first 2 conditions apply to any case, the other conditions will depend on your findings)
- Make sure Exclude is selected and set the conditions. First condition:
- Default Channel Grouping > exactly matches > Direct, then click "AND"
- Second condition:
- Landing Page > exactly matches > /, then click "AND"
- The third condition will depend on the pattern you found. Using some of the examples previously mentioned:
- a) Old Flash Versions: Flash Version > matches regex > 11\.5\sr502|10\.3\sr183|13\.0\sr0
- b) Hubspot provider: Service Provider > exactly matches > hubspot
- Set a meaningful name for the segment, for example "0. All Users - No bots", and save. All traffic matching these conditions will be excluded from your reports while the segment is selected.
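Before saving the segment, it is worth sanity-checking the regex against the values you saw in the report. A quick check of the Flash-version pattern from example (a) — note that `\s` matches the space inside GA's Flash Version values:

```python
import re

# Pattern from example (a): old Flash versions seen in the bot wave.
FLASH_BOTS = re.compile(r"11\.5\sr502|10\.3\sr183|13\.0\sr0")

# The first three values should match; a current Flash version should not.
for value in ("11.5 r502", "10.3 r183", "13.0 r0", "22.0 r0"):
    print(value, "->", bool(FLASH_BOTS.search(value)))
```

If the pattern flags a version your real users also run, tighten it before using it in a segment or filter.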
Creating a segment with condition (a) worked perfectly for some of my clients; it removed most of the unnatural direct traffic.
Need help with this?
If for some reason you are not able to find the source of the unnatural traffic and need a hand, let me know! I can personally review your Analytics.
Your opinion is important
Bots usually crawl multiple sites, and it's possible that other people are having the same issue as you. By sharing your experience and findings, you may help others.