Digital Marketing Consultant and Web Analyst

Passionate about UX Design and SEO, or what he calls UX-SEO

Learn how to block all junk traffic and keep your analytics data clean

NOTE: This guide will help you to stop all hits from the most common sources of junk traffic. Following all the steps will ensure that you are receiving not only clean but also meaningful data for your Analytics.

However, the guide is built so that each section can be completed independently. If you only want to stop and remove the spam, go to sections 2a, 2b and 3.


Latest Referral Spam (last checked: October 28, 2016) / referral scanner-[name].top cookie-law-enforcement-**.xyz / compliance-[name].top

Latest Keyword / Page Spam

cdn eu cookie law share buttons
social share buttons share this button website buttons
floating share buttons share button generator  

* The filters in this guide are optimized so that you won't have to create or update filters each time a new spammer shows up. They stop the spam whether it shows up as a referral, keyword or page.

Google Analytics is probably the most important tool you use to understand the traffic to your site and to make important decisions about your design and marketing options.

But, how accurate and meaningful is your data?

It is very easy to start tracking your visitors with Google Analytics, but like any tool, you need to tweak it to make sure it works as you expect. Otherwise, spam, bots, internal traffic and other sources of irrelevant traffic will corrupt your data and cloud the true performance of your site.

What you need to have reliable and accurate data

In this guide, I will show you not only how to get rid of the spam and other junk traffic that corrupts your Analytics, but also how to do it safely, so you don't risk your data, and efficiently, so you don't waste time updating or creating filters every time a new spammer shows up.

Here is what you will accomplish:

  1. Protect your data: How to correctly configure your views and protect your data from possible misconfigurations.
  2. Stop the junk traffic: How to create efficient filters to block:
    • Ghost and Crawler Spam
    • Good Bots
    • Internal traffic
  3. Clean your historical data: How to create an advanced segment to remove the spam from your historical data.

Don't have time to do this? Let the expert do it for you.

I can personally set everything you need to get rid of all past, current and any future ghost spammer that tries to hit your Google Analytics. Plus I can help you find and fix other areas to ensure you are receiving not only clean but also meaningful data.


1. Protect your Data from misconfigurations

This step is not strictly required to clean your analytics, but it is important to protect it from misconfigurations. Every Google Analytics account should have the following views:

  • Master
  • Unfiltered
  • Test (Optional)

The unfiltered view, as its name states, shouldn't have any filter or any setting that alters the incoming data. That way you will always have a backup.

Additionally, if you want to be extra cautious you can create a test view where you will check your filters before applying them to the master view.

If you are already following these best practices, go to the next step. If you aren't and need help, you can find the instructions here.

How to create an unfiltered and a test view

2. Stop the spam and other junk in Google Analytics

Once you ensure that your data is protected, the next step is blocking all of that dirty traffic that inflates your reports, but before doing that, let's go over what you shouldn't do. If you've made any of the below-listed mistakes, revert the changes if possible.

Wrong ideas about the spam in Analytics

The Right Way of dealing with the spam.

These solutions have proven to be very effective. I've applied them to all my Analytics and to more than 400 sites that I have personally helped in the last year.

The following image is an example of one of those sites, which implemented these solutions in November 2015: the same filters and settings that you will create for your Analytics.

To get these results, you will need to create three filters and enable one built-in feature:

  1. Filter for Ghost Spam (stops it in any form: referral, organic, page).
  2. Filter for Crawler Spam.
  3. Enable the built-in feature Bot Filtering (to exclude known bots).
  4. Filter for Internal traffic, containing all the IPs used by you or your team.

Let the expert do it for you. If you prefer, I can set up everything for you:

  1. Configure advanced filters to stop the spam and other junk traffic
  2. Create an advanced segment to clean your historical data affected by spam
  3. Review your Analytics settings and enhance them for better data collection.
  4. Protect your data by checking essential configurations of your views.

A couple of general notes about the filters

  • While filters start working within minutes most of the time, officially it may take up to 24 hours before their effects become visible in your data, so be patient.
  • You will apply the filters either in the master view (the view(s) to be used for analysis) or the test view if you want to try them first.

a. How to Stop Ghost Spam in Google Analytics

What is ghost spam?

The main characteristic of ghost spam is that it never visits your site (it is not a bot). Instead, it uses the Measurement Protocol to send hits directly to your GA. For that reason, this type of spam always leaves a fake hostname or an undefined one, usually shown as (not set).

If we use this to create a filter that only lets through traffic with valid hostnames, all ghost traffic will be excluded automatically. This solution is much more efficient than the commonly used one, which is to create a filter with the name of each spammer. It stops not only the fake referrals but also any other form of ghost spam, such as keywords, events, and pages.


There is one thing I want to make clear, because sometimes there is a bit of confusion. Some people confuse the hostname with the source. The source is where your visit comes from, for example, Facebook, Google, any link from another site to your page, etc. (there are plenty of possible sources).

The hostname, on the other hand, is the site where the visitor arrives. It is something that you control, and your hostnames can usually be counted on your fingers. The main one will be your domain and, depending on the configuration of your site, there may be some others.


To make it clearer, I will give you an example:

If we consider a visit that comes from Facebook to this article:

Facebook >>

The visit will be recorded in Google Analytics like this: 

  • Hostname:
  • Source (referral):

So as long as you add all your hostnames you don't have to worry, you won't exclude any real traffic.

This part of the guide may be the most complicated, but it is also the most important for getting rid of spam, and it can even help to eliminate other sources of irrelevant traffic.

How to find your valid hostnames

The most important part of this method is getting a list of all your valid hostnames to avoid excluding any legitimate traffic.

  1. In the Reporting section, select a wide time frame on the calendar, then go to the Audience reports in the sidebar.
  2. Expand Technology and select Network.
  3. Make sure you select Hostname at the top of the report (by default Service Provider is selected). You will see a report with all your hostnames (real and fake).
  4. Make a list of all relevant hostnames you find. You will see at least one, which will be your primary domain. The rest depend on the configuration of your site and on all the services where you have added your tracking code (UA-000000-1). Here are some examples:
    • Your main domain
    • Subdomains
    • Translate services: Bing translate
    • Shopping carts: Shopify
    • CDNs: CloudFlare
    • Video services: YouTube
    • Cache services: Google cache
    • IPs
    • Payment services: PayPal

    An invalid hostname is essentially any hostname that you do not know or control. For example:

    - Hostnames with URLs pointing to the spammer's website.
    - Well-known sites (spammers use them to mislead people).
    - The most common hostname for spam, (not set). This happens when the spammer doesn't even bother to set a fake hostname.

    The following screenshot is an example of a hostname report.

    Green: Valid Red: Spam

    Google Analytics Hostname Report

    From the report above we get the following valid hostnames (remember we are looking for real valid hostnames to include not to exclude)

Once you have the list of all your valid host names, the next step is to create an expression that contains all of them to add it to your filter. This is important because you can only have one include hostname filter.


How to build your valid hostname filter expression

Once you have gathered all of your valid hostnames, create a regular expression (REGEX) that matches all of them. Here are some tips to help you build your expression:

  • To separate each hostname, use a bar or pipe character |, which works as OR. If you can't find it on your keyboard, hold Alt and type 124 on the numeric pad.
  • The dot . and the hyphen - are treated as special characters in REGEX, so you should add a backslash \ before them.
  • Try to find a good way to match as many hostnames as you can with a short pattern. For example, if several of your hostnames share a distinctive string such as ohow, entering ohow is enough to match all of them (just avoid using common names).
  • Don't leave any spaces.
  • The REGEX has a limit of 255 characters. If your expression exceeds this limit, try to optimize it to keep everything under one expression, because you can only have one Include hostname filter.
  • Don't add a pipe/bar | at the beginning or the end of the expression, because the empty alternative it creates matches everything.
  • More about Regular Expressions
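
The tips above can be sketched in a few lines of Python. This is only an illustration, not part of Google Analytics: the function name build_hostname_expression and the example hostnames are hypothetical, and the 255-character limit is the one stated in the tips. Python's re.escape adds the backslashes before dots and hyphens.

```python
import re

GA_FILTER_MAX_LENGTH = 255  # character limit mentioned in the tips above


def build_hostname_expression(hostnames):
    """Join hostnames into one pipe-separated regex, escaping special characters."""
    # Strip whitespace and drop empty entries so the result has no spaces
    # and no leading, trailing, or doubled pipes.
    cleaned = [h.strip() for h in hostnames if h and h.strip()]
    if not cleaned:
        raise ValueError("at least one hostname is required")
    expression = "|".join(re.escape(h) for h in cleaned)
    if len(expression) > GA_FILTER_MAX_LENGTH:
        raise ValueError(
            f"expression is {len(expression)} characters; "
            f"shorten it to {GA_FILTER_MAX_LENGTH} or fewer"
        )
    return expression


# Hypothetical hostnames, used only for illustration.
valid_hostnames = ["example.com", "my-shop.example.net", "translate.example.org"]
expression = build_hostname_expression(valid_hostnames)
print(expression)
```

If the function raises because the expression is too long, look for a distinctive string shared by several hostnames and use that single string instead of listing each hostname.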

Following the recommendations above and using the list of Valid hostnames from the example:

List of hostnames found in the report

We built the expression that will match all of them:


It is important that you add all your relevant hostnames, or you run the risk of losing valid data. You can test your expression with a quick segment.
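
In addition to the quick segment, you can sanity-check an expression offline against the hostnames copied from your report. The sketch below assumes that the filter pattern matches anywhere inside the hostname (a partial match), which is what the tip about short distinctive strings implies. The expression and the hostnames are hypothetical, and Python's regex dialect may differ from the one Google Analytics uses for more advanced syntax.

```python
import re

# Hypothetical expression and report rows, for illustration only.
expression = r"example\.com|my\-shop\.example\.net"
report_hostnames = [
    "example.com",
    "www.example.com",
    "my-shop.example.net",
    "(not set)",
    "spam-site.invalid",
]

pattern = re.compile(expression)
# re.search looks for the pattern anywhere in the hostname (a partial match).
kept = [h for h in report_hostnames if pattern.search(h)]
dropped = [h for h in report_hostnames if not pattern.search(h)]

print("Included:", kept)
print("Excluded:", dropped)
```

Every hostname you consider valid should appear in the included list. If one of them ends up in the excluded list, the expression is missing an entry.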

Once you are sure the expression is correct, it's time to create the filter.

How to create a valid hostname filter

To block all ghost spam in Google Analytics:

  1. Go to the Admin tab, and select the view where you want to apply the filter. If you follow the naming above, this will be the Master view or Test view.
  2. Select Filters under the View column and select + Add Filter
  3. Enter a name for the Valid Hostname filter.
  4. In Filter Type, select Custom.
  5. IMPORTANT: Make sure you choose Include (you may need to scroll down a little) and select Hostname from the dropdown.
  6. Copy and paste the hostname expression that you built into the Filter Pattern box. If you click on Verify this filter, you will get a quick preview of how the filter will work. You should only see spam or irrelevant hostnames on the left side of the preview table.

    If you get the message below, it is probably because of the limited data used by this feature:

    This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small

    Try verifying it with a quick segment (if you haven't done it yet).

  7. After making sure your filter is OK, save the filter.

IMPORTANT: This filter will stop most of the spam and doesn't require updates for new spammers, but it's essential to update the expression whenever you add the tracking ID to a new service or domain.

b. How to stop Crawler Spam in Google Analytics

What is crawler spam?

Crawler spam is harder to detect since it uses a valid hostname, so you'll need a different filter with an expression that matches all known crawler spam.

To save you some time, I've created an optimized REGEX for crawler spam that you'll find below in the instructions. If you prefer, you can build your own the same way as the valid hostname expression, but this time using the source (referral) name.

How to create an exclude filter for crawler referral spam

The process to create a crawler filter is similar to the previous one.

  1. Go to the Admin tab.
  2. Under the View column, select Filters and click + Add Filter.
  3. Enter "Crawler Spam" as a name.
  4. Filter Type > Custom > Exclude
  5. Filter Field > Campaign Source
  6. Filter Pattern > Paste the following crawler spam expression (if you want, you can create your own expression, similar to the valid hostname expression).

    The following expressions are optimized to block all crawler spam detected over the last couple of years.

    Create 1 filter for each expression

    # Expression 1


    # Expression 2


    You can get alerts whenever a new crawler is detected so you can keep the spam away. These alerts will contain the name of the crawler along with the updated expressions, so you will only have to copy and paste them into your filters.

    Again, you can click Verify this filter to get a quick preview of how the filter will work. You should only see referral spam on the left side of the preview table.

    Since crawler spam is less common, it is very likely that you will get the following message:

    This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small

    If you followed the guide carefully, don't worry, the filter will work. Alternatively, you can verify the filter with a quick segment.

  7. After everything is set, save the filter.

Note: You may find other referrals that are not spam but are not relevant to you either, for example, mobile test sites or cache sites. You can create a similar filter with the same configuration and add all the irrelevant referrals.


c. How to Exclude Internal Traffic

Internal traffic is the traffic generated by you or other people on your team. This type of traffic is often overlooked, but it can cause even more problems than spam because it won't have a source, and it's much harder to detect and remove later.

To create this filter you will need the public IP of the network/wifi you want to exclude. You can find it here. You will see something like this: 12.345.678.90

You can use the same filter to block multiple IPs by creating a regular expression. To do that you need to add a pipe or vertical bar | after each IP like this:

If you use a range of IPs, you can use this tool to help you build your IP regex.

Note: Dots are considered special characters in REGEX, so strictly speaking you need to add a backslash before every dot, like this: 12\.345\.678\.90. However, this filter will work without them, because an unescaped dot matches any character, including a literal dot.
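
As a rough illustration of the multiple-IP expression, here is a small Python sketch. The function name build_ip_expression and the addresses (taken from the ranges reserved for documentation) are hypothetical. The ^ and $ anchors are an optional addition that is not part of the steps in this guide; they are there so that one address cannot match as a fragment of a longer one.

```python
import re


def build_ip_expression(ips):
    """Build a pipe-separated regex that matches each listed IPv4 address exactly."""
    escaped = [re.escape(ip.strip()) for ip in ips if ip.strip()]
    if not escaped:
        raise ValueError("at least one IP address is required")
    # ^ and $ anchor the match so 203.0.113.5 does not also match 203.0.113.50.
    return "^(" + "|".join(escaped) + ")$"


# Hypothetical office and home addresses, for illustration only.
office_ips = ["203.0.113.5", "198.51.100.23"]
ip_expression = build_ip_expression(office_ips)
ip_pattern = re.compile(ip_expression)
print(ip_expression)

# An unescaped dot still matches the real address, but it matches other strings too.
loose_pattern = re.compile("203.0.113.5")
print(bool(loose_pattern.search("203.0.113.5")))   # the real address
print(bool(loose_pattern.search("203x0x113x5")))   # not an IP, but still a match
```

In practice the looser pattern rarely causes trouble, since the field only ever contains IP addresses, which is why the filter works without the backslashes.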

Once you have your public IP or expression with multiple IPs you can create the filter.

How to create a filter for internal traffic

  1. In the Admin section, under the VIEW column, select the view where you want to apply the filter from the dropdown (master or test) and click on Filters.
  2. In the filter window, select Custom as Filter Type.
  3. Under Exclude, select IP Address as Filter Field.
  4. Copy your IP or IP expression into the Filter Pattern box.
  5. Enter "Exclude Internal Traffic IP" as the name and save.

You can't verify IP filters the same way as the spam filters. To test your filter, visit your site from the location you just excluded and check that you are not showing up in the real-time report. If you still see your visit, wait some time, since the filter may take up to 24 hours to take effect.

If your filter doesn't work after this period, check whether you entered the correct IP and whether you have a misplaced or extra character in the expression.

d. Enable "Exclude all hits from known bots and spiders"

There are many other crawlers around that are not spam but are not useful for your reports either, for example, the ones crawling your site for indexing. These bots will leave a record in your reports if they are not excluded.

This case is a bit easier because Google Analytics has a built-in feature to exclude this traffic.

Selecting this option will exclude all hits that come from bots and spiders on the IAB known bots and spiders list. The backend will exclude hits matching the User Agents named in the list as though they were subject to a profile filter. This will allow you to identify the real number of visitors that are coming to your site.

It is important to mention that you won't be blocking these bots from accessing your site, only preventing them from being counted as real visits.

How to enable bot filtering

Repeat the following steps for all your views:

  1. Select one of your views under the VIEW column in the Admin section.
  2. Click View Settings.
  3. Near the bottom, check the box Exclude all hits from known bots and spiders (Bot Filtering).
  4. Save and repeat the process with all your views.

You can see the full details about Google Analytics Bots and Spider Filtering here.

3. Removing the spam from Google Analytics (Historical Data)

As I mentioned earlier, the spam that is already stored in your Analytics can't be permanently deleted. That is why it is important to create the filters first, to stop receiving junk traffic.

However, you can still clean your past data affected by spam by using the valid hostname expression you built previously and an advanced segment.

How to eliminate the spam from your Google Analytics

To remove the spam from your Google Analytics historical data:

  1. In the Reporting section, click on the box that says All Users, then click the red button +NEW SEGMENT.
  2. Scroll almost to the bottom of the Segment window and click Conditions.
  3. Configure the first condition:
    • Filter > Sessions > Include
    • Field 1 > Hostname
    • Field 2 > matches regex
    • Field 3 > Paste the hostname expression that you previously used for the filter.
  4. Click +Add Filter at the bottom to add a new condition.
  5. Configure the second condition:
    • Filter > Sessions > Exclude
    • Field 1 > Source
    • Field 2 > matches regex
    • Field 3 > Paste the crawler spam expression.

    The following expression is optimized to exclude all crawler spam detected over the last couple of years.


    You can get alerts whenever a new crawler is detected so you can keep the spam away. These alerts will contain the name of the crawler along with the updated expression, so you will only have to copy and paste it into your filters.

  6. Enter "All Users - clean" as a name for the segment and Save.

After saving the segment you will be able to see spam-free reports, as long as the segment is selected.

Eventually, the filters will do their work, and you won't need to use the segment.
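
To make the logic of the two segment conditions concrete, here is a minimal Python sketch that applies them to a handful of made-up sessions. The Session class, both expressions and all the sample values are hypothetical; substitute the expressions you built for your own filters. It only mimics the include and exclude conditions, not how Google Analytics processes segments internally.

```python
import re
from dataclasses import dataclass


@dataclass
class Session:
    hostname: str
    source: str


# Hypothetical expressions; substitute the ones you built for your filters.
VALID_HOSTNAMES = re.compile(r"example\.com")
CRAWLER_SOURCES = re.compile(r"crawler\-spam\.invalid|bad\-bot\.invalid")


def is_clean(session):
    """Condition 1: include valid hostnames. Condition 2: exclude crawler sources."""
    has_valid_hostname = bool(VALID_HOSTNAMES.search(session.hostname))
    is_crawler = bool(CRAWLER_SOURCES.search(session.source))
    return has_valid_hostname and not is_crawler


sessions = [
    Session("www.example.com", "google"),                # real visit
    Session("(not set)", "ghost-spam.invalid"),          # ghost spam: no valid hostname
    Session("www.example.com", "crawler-spam.invalid"),  # crawler spam: valid hostname
]
clean = [s for s in sessions if is_clean(s)]
print(len(clean), "of", len(sessions), "sessions kept")
```

The ghost session is dropped by the first condition and the crawler session by the second, which is why the segment needs both.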


Historical Spam List (Crawler and Ghost).

This list is updated frequently. You can keep it as a reference in case you find any suspicious referral or keyword.

Historical Crawler Spam List / referral  

 List of latest Ghost Spam / referral
cdn scanner-[name].top compliance-***.top / referral \ / page title
Check full historical list of ghost spam


Ghost spam has a short life and often these fake sites get banned. For example, if you try to access (one of the spammers) you will be redirected to a "page not found" in


Google Analytics is a powerful tool that will help you understand your traffic data, but you have to prepare it to receive clean and meaningful data. Otherwise, you might be pointed in the wrong direction.

Even on high volume websites where data spamming would be marginal, you still have to explain why there's such a discrepancy. As an analyst you can't dismiss it simply by saying "nah... we're not too sure what it is, but I heard about that spamming thing..."

-Stéphane Hamel

By applying these solutions you ensure that you will receive accurate data. Here is an example of one of the Accounts. The image shows a comparison of data with and without spam and other junk traffic.


Security, SEO, rankings and other concerns

Because the article was getting too long, I separated the theoretical part from the practical one. If you have more questions or concerns, check the FAQ I made about the spam in Google Analytics.

  • Does the spam harm my SEO rankings?
  • Is Google doing something to handle this threat?
  • How does it get into your reports?

And many other answers and demonstrations.

Other ways of improving your Analytics

Cleaning up your analytics is an important part of keeping it healthy. To get even better insights, I recommend that you follow these best practices.

Do you have any Questions/Feedback?

I've tried to cover every important detail in the article, however, if there is any part of the guide where you got stuck let me know in the comments and I'll try to clarify it.

Spam hits thousands of Google Analytics users. If this article helped you, please consider sharing it or leaving a comment with your experience -- it may help other people :)

