Author

Analytics Expert. Passionate about SEO and User Experience or what he calls UX-SEO


How to Identify and Stop Bot Direct Traffic in Google Analytics

Finding the cause of unnatural spikes in direct traffic can be tricky because there is no source information to go on. To help you identify the problem and clean it from your reports, I'll show you the possible causes of a sudden increase in direct traffic in your Google Analytics.

This post focuses only on bot direct traffic. For Analytics spam (referral, keyword, language, etc.), use this guide.

Direct traffic caused by bots

The most common reason for a sudden increase in direct traffic in Google Analytics is bots, so let's look at what a bot is, how to identify it and, most importantly, how to exclude it from your reports.

What is a bot?

A bot (aka web crawler, spider or robot) is an automated program or script that browses the internet gathering information. Some of them are beneficial to your site, like Googlebot, while others are irrelevant. But no matter what the purpose of the bot is, the data it leaves in your Analytics is useless and may interfere with your real users' data.

Where do these bots come from?

There are thousands of bots crawling the web for different purposes; these are some examples:

  • Search engine crawlers
  • Ads networks
  • Analytics services
  • Statistics sites
  • Scraping sites

Before filtering this traffic out, it is important to determine whether it really comes from spiders or from real users.

If you are seeing dozens, hundreds or even thousands of direct visits out of nowhere, with a bounce rate close to 100% and an Avg. Session Duration close to 0s, then you are most probably receiving bot traffic.

Common characteristics of bot direct traffic are:

  • Sudden spike in direct visits
  • Default Channel Grouping: Direct
  • Landing Page: / (or your homepage URL, for example /home or /index.html)
  • Bounce Rate: close to 100%
  • Avg. Session Duration: close to 0s
  • Pageviews: 1 per session
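
The characteristics above can also be checked on exported data. The following is only a rough sketch in Python, assuming you have exported daily figures for the Direct channel into a list of dictionaries. The key names (`date`, `sessions`, `bounce_rate`, `avg_session_duration`) and the thresholds are my own assumptions for illustration, not something Google Analytics provides in this form.

```python
from statistics import median


def find_suspicious_days(daily_stats, spike_factor=3.0,
                         min_bounce_rate=0.95, max_avg_duration=1.0):
    """Return the dates whose direct traffic looks bot-like.

    daily_stats is a list of dicts with hypothetical keys:
    'date', 'sessions', 'bounce_rate' (0..1) and
    'avg_session_duration' (seconds).
    """
    # The median is used as the baseline so that the spike itself
    # does not inflate what counts as a "normal" day.
    baseline = median(day["sessions"] for day in daily_stats)
    suspicious = []
    for day in daily_stats:
        is_spike = day["sessions"] > spike_factor * baseline
        looks_automated = (
            day["bounce_rate"] >= min_bounce_rate
            and day["avg_session_duration"] <= max_avg_duration
        )
        if is_spike and looks_automated:
            suspicious.append(day["date"])
    return suspicious


# Made-up sample data, for illustration only.
sample_days = [
    {"date": "2015-07-03", "sessions": 40, "bounce_rate": 0.55, "avg_session_duration": 95.0},
    {"date": "2015-07-04", "sessions": 38, "bounce_rate": 0.52, "avg_session_duration": 101.0},
    {"date": "2015-07-05", "sessions": 900, "bounce_rate": 0.99, "avg_session_duration": 0.4},
    {"date": "2015-07-06", "sessions": 42, "bounce_rate": 0.57, "avg_session_duration": 88.0},
]
print(find_suspicious_days(sample_days))
```

A day is flagged only when it is both a spike and shows the near-100% bounce rate and near-zero duration, so a legitimate traffic peak (for example after a newsletter) should not be flagged by the volume alone.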

And here are some common waves that I've detected in some of the accounts I've worked with:

  • July 5, 2015, characteristic: old Flash versions (11.5 r502)
  • January 25, 2016, characteristic: Chrome 43.0.2357
  • March 2016, characteristic: Service Provider Hubspot
  • July 2016, characteristic: multiple new service providers

Once you determine that the direct traffic comes from bots, you should try to find other particular characteristics that will help you exclude it from your reports.

How to pin down bot traffic and exclude it from your Analytics 

The first thing you should do if you haven't done it already is to enable the option Bot filtering in your Google Analytics. This will exclude some hits from known bots and spiders. Unfortunately, this only works for future hits and many bots are not included in this list.

To exclude the rest of the bots, you should use an advanced segment.

You will have to play detective and find some clues that will help create the conditions for the segment; the city, the browser version, service provider, anything that represents this traffic and helps to exclude it safely.

How to find patterns in direct bot traffic

To find some characteristics that will help you exclude this traffic:

  1. Go to the Reporting section of your Google Analytics and select the period where the Direct traffic occurred.
  2. Expand Acquisition and select Channels
  3. Then click on Direct and then on the Homepage (usually represented by a slash /)
  4. Once there start selecting different Secondary dimensions (at the top of the report) 

Start looking for patterns between the different dimensions.


For example, these are some common characteristics of the common waves mentioned previously:

  • Service Provider: Hubspot
  • Old Flash versions rarely used by real users: 11.5 r502, 10.3 r183 and 13.0 r0
  • Browser Version: 43.0.2357 (Chrome)

Tip: Open a second window with the same report, using dates when the traffic was normal, to compare the differences.

Once you find 1 or more patterns (the more, the better) you can use them to create an advanced segment to exclude this traffic.
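
If you prefer to compare the two periods outside the interface, the same idea can be sketched in Python. This is just an illustration under my own assumptions: the sessions are exported as dictionaries, the key `service_provider` is a hypothetical name for the exported dimension, and the thresholds `min_share` and `ratio` are arbitrary starting points.

```python
from collections import Counter


def overrepresented_values(spike_sessions, normal_sessions, dimension,
                           min_share=0.2, ratio=5.0):
    """List values of a dimension that dominate the spike period
    but are rare or absent in the normal period."""
    spike_counts = Counter(s[dimension] for s in spike_sessions)
    normal_counts = Counter(s[dimension] for s in normal_sessions)
    spike_total = sum(spike_counts.values())
    normal_total = sum(normal_counts.values())

    flagged = []
    for value, count in spike_counts.most_common():
        spike_share = count / spike_total
        # A value never seen in the normal period gets a share of 0.
        normal_share = normal_counts[value] / normal_total if normal_total else 0.0
        if spike_share >= min_share and spike_share >= ratio * normal_share:
            flagged.append((value, round(spike_share, 2)))
    return flagged


# Made-up sample data, for illustration only.
spike_period = [{"service_provider": "hubspot"}] * 8 + [{"service_provider": "comcast"}] * 2
normal_period = [{"service_provider": "comcast"}] * 6 + [{"service_provider": "verizon"}] * 4
print(overrepresented_values(spike_period, normal_period, "service_provider"))
```

Running the same function over several dimensions (browser version, Flash version, city) gives you a short list of candidate patterns to verify by hand in the reports.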

Need help to find the reason of the unnatural traffic?

I can personally assist you to find patterns and create filters or an advanced segment to exclude the unnatural direct traffic.

How to clean direct bot traffic in Google Analytics

To remove bot traffic from Google Analytics:

  1. Go to the Reporting section of your Google Analytics
  2. Click on "+ Add Segment" at the top of any of your reports
  3. Click the red button "+New Segment"
  4. Almost at the bottom of the window select Conditions
  5. Make sure Exclude is selected, and set the conditions. First condition:
    • Default Channel Grouping > exactly matches > Direct (then click on "AND")
  6. Second condition:
    • Landing Page > exactly matches > / (then click on "AND")
  7. The third condition will depend on the pattern you found. You can use the ones I found if they match your case:
    • Old Flash Versions: Flash Version > matches regex > 11\.5\sr502|10\.3\sr183|13\.0\sr0
    • Hubspot provider: Service Provider > exactly matches > hubspot
  8. Set a meaningful name for the segment, for example "0.All Sessions - No bots", and save. All traffic matching these conditions will be excluded from your reports while the segment is selected.
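
Before saving the segment, it can help to test the regular expression against a few sample values. The sketch below mirrors the three conditions (Direct, landing page /, old Flash version) in Python. The dictionary keys are hypothetical names for exported dimensions, and `re.search` is used on the assumption that the "matches regex" operator accepts partial matches.

```python
import re

# Same expression as in the Flash Version condition above.
OLD_FLASH = re.compile(r"11\.5\sr502|10\.3\sr183|13\.0\sr0")


def matches_exclusion(session):
    """True when a session meets all three conditions of the segment.

    session is a dict with hypothetical keys 'channel',
    'landing_page' and 'flash_version'.
    """
    return (
        session["channel"] == "Direct"
        and session["landing_page"] == "/"
        and OLD_FLASH.search(session.get("flash_version", "")) is not None
    )


# Made-up sample data, for illustration only.
sample_sessions = [
    {"channel": "Direct", "landing_page": "/", "flash_version": "11.5 r502"},
    {"channel": "Direct", "landing_page": "/", "flash_version": "18.0 r0"},
    {"channel": "Organic Search", "landing_page": "/", "flash_version": "11.5 r502"},
    {"channel": "Direct", "landing_page": "/blog", "flash_version": "13.0 r0"},
]
clean_sessions = [s for s in sample_sessions if not matches_exclusion(s)]
print(len(clean_sessions))
```

Only the first sample session is dropped: the other three fail at least one of the conditions, which is the point of joining them with "AND" so that real users with an old Flash version on other channels or pages are kept.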

Creating a segment with the old Flash versions condition worked perfectly for one of my accounts: it removed most of the unnatural direct traffic (orange), and the clean data is back to normal (blue).

You can add more sets of conditions to exclude other patterns by clicking on "+Add filter" at the bottom.

Other less common causes of spikes in direct traffic

Bot traffic will cover most of the cases of unnatural direct traffic. Here are other less common reasons:

Using the referral exclusion list to block the fake traffic

The purpose of this list is not to filter traffic but to prevent some sites, like payment gateways, from showing up as referrals. However, many sites recommend using the Referral exclusion list to stop spam.

Adding fake referrals to this list will only strip the referral part of the visit and leave it as direct, which is even worse.

The solution for this issue is simple: remove all the wrong entries from the list and follow this guide to get rid of the spam instead.

Ghost traffic sent incorrectly by the spammer

The Ghost technique is used by many spammers to send fake referrals (URLs) to multiple Google Analytics accounts. Sometimes the spammer forgets to set the source or URL in the visit, so that traffic shows up as "direct."


To find out whether you are dealing with this situation, you simply have to check the hostname (What is a hostname?):

  • Reporting > Acquisition > All Channels > Direct > Select Hostname as a Secondary Dimension

If you see fake or unusual hostnames, or the hostname shows as (not set), then the traffic is coming from ghost spam. You will also notice that the metrics show numbers like a bounce rate of 100% or an Avg. Session Duration of 0s.

To properly clean this fake traffic, no matter if it shows as a referral or as direct, follow this guide.
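
The hostname check can also be sketched in Python for exported data. This is a minimal illustration, assuming each session is a dictionary with a hypothetical `hostname` key; the set `VALID_HOSTNAMES` is a placeholder that you would replace with the domains where your tracking code is really installed.

```python
# Placeholder list: replace with the hostnames that really serve your site.
VALID_HOSTNAMES = {"example.com", "www.example.com"}


def split_by_hostname(sessions, valid_hostnames=VALID_HOSTNAMES):
    """Separate sessions with a known hostname from likely ghost spam.

    A missing hostname is treated like "(not set)".
    """
    legit, ghost = [], []
    for session in sessions:
        hostname = session.get("hostname", "(not set)").lower()
        if hostname in valid_hostnames:
            legit.append(session)
        else:
            ghost.append(session)
    return legit, ghost


# Made-up sample data, for illustration only.
direct_sessions = [
    {"hostname": "www.example.com"},
    {"hostname": "(not set)"},
    {"hostname": "free-traffic.invalid"},
    {"hostname": "Example.com"},
]
legit, ghost = split_by_hostname(direct_sessions)
print(len(legit), len(ghost))
```

Working from a list of valid hostnames rather than a list of fake ones means new spam hostnames are caught without having to update anything.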

Your opinion is important

Bots usually crawl multiple sites, and it's possible that other people are having the same issue as you. By sharing your experience and findings, you may help others :)
