What is the 2nd thing to do when considering a web analytics implementation?

What came first?

[This article is part of a series entitled: GA Implementation ABCs]

In the first post of this series, What is the 1st thing to do when considering a web analytics implementation?, I discussed how important it is simply to get initial data in before tackling the much wider (and more complex) issues of mapping your stakeholders, building your KPIs or assessing what your business needs from your web site. Essentially, my view is: get an initial feel for the project by getting the data in, and that means tagging all your site pages (including tracking non-standard files such as PDF, EXE and ZIP downloads).

With data coming in, the 2nd thing to consider is adding filters. Filters in Google Analytics have many purposes, such as segmentation and report augmentation. In this post I focus on their role in data cleansing. Keeping the data ‘clean’ means removing visits that are unwanted or invalid. Think of this as improving the signal-to-noise ratio of your data. Having clear signals means you don’t waste time analysing what could be random events (noise) on your web site.

Example cleansing filters include:

  1. Your own access to your web site
    – this can be a significant volume of non-converting traffic if your employees set their browser opening page to be the company web site. Such visits will over-inflate your visitor and pageview counts and decrease your conversion rates.
  2. Your web developers/designers updating content
    – these visits can be significant in volume but, more importantly, web developers are likely to update and test conversion pages, triggering goals and over-inflating your conversion rate metrics.
  3. Data contamination
    – other web sites copying your GATC (Google Analytics Tracking Code), either deliberately or accidentally, which results in meaningless data being mixed with your web site visit data.

All 3 of these should be removed by adding 3 filters to your GA configuration as follows:

Filter 1: removing yourself from the reports

Excluding known visitors is very straightforward. If visitors connect to the Internet via a fixed IP address, you simply select the predefined filter ‘Exclude All Traffic from an IP Address’ from the Filter Manager as shown:

[Figure: filter to exclude an IP address from Google Analytics]

Excluding visits from employees, your search marketing agency or any known third party, such as your web developers, is an important step when first creating your profiles. These visitors generate a relatively high number of pageviews in areas that will greatly impact key metrics such as your conversion rates. For example, employees with their browser home page set to the company web site will show in your reports as a returning visitor every time they open their browser, and most likely as a one-page visitor. Remember that the GATC deliberately bypasses any caching, so every one of these page loads is counted. It is therefore important to separate employee visits from those of potential customers.

Similarly, web developers heavily test checkout systems for troubleshooting purposes. These tests also trigger GATC page requests, most likely for your goal conversion pages. You should therefore remove all such visits from your reports.
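To make the effect of an IP exclude filter concrete, here is a minimal Python sketch. It is not how Google Analytics works internally; the hit records, the field names and the address 203.0.113.10 are all hypothetical, chosen only to illustrate the idea of dropping every hit that comes from one known fixed IP address.

```python
import re

# Hypothetical office IP address (203.0.113.0/24 is reserved for documentation).
OFFICE_IP = "203.0.113.10"

# Escape the dots so each "." matches a literal dot, and anchor the pattern
# so that a longer address such as 203.0.113.100 is not matched by accident.
EXCLUDE_PATTERN = re.compile("^" + re.escape(OFFICE_IP) + "$")


def exclude_by_ip(hits, pattern=EXCLUDE_PATTERN):
    """Return only the hits whose IP address does NOT match the pattern."""
    return [hit for hit in hits if not pattern.search(hit["ip"])]


# Made-up sample hits, for illustration only.
sample_hits = [
    {"ip": "203.0.113.10", "page": "/"},           # employee opening the home page
    {"ip": "198.51.100.7", "page": "/products"},   # potential customer
    {"ip": "203.0.113.100", "page": "/checkout"},  # different address, so it is kept
]

kept = exclude_by_ip(sample_hits)
```

In this sketch only the employee hit is dropped. The anchoring matters: without the leading ^ and trailing $, the third address would also be excluded.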

Filter 2: removing your designers/developers from the reports

This is simply an extension of Filter 1, using the IP address of your agency in place of that of your own office. But what if IP addresses change each time they log in? I will discuss this scenario in a later post. However, the vast majority of business broadband lines use fixed IP addresses, so you should be OK.
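If you keep a list of the fixed addresses you want to exclude, a small helper can build and check the pattern before you type it into a filter. The sketch below is only an illustration: the function name build_ip_pattern and both addresses are hypothetical, and it assumes the filter pattern is interpreted as a regular expression, as in the hostname example later in this post.

```python
import re


def build_ip_pattern(addresses):
    """Build one anchored regex that matches any of the given fixed IPs."""
    escaped = [re.escape(address) for address in addresses]
    return "^(" + "|".join(escaped) + ")$"


# Hypothetical addresses taken from ranges reserved for documentation.
office_ip = "203.0.113.10"
agency_ip = "198.51.100.25"

pattern = build_ip_pattern([office_ip, agency_ip])
compiled = re.compile(pattern)

# Quick sanity checks before using the pattern anywhere.
matches_agency = bool(compiled.search(agency_ip))
matches_lookalike = bool(compiled.search("198.51.100.250"))
```

Here matches_agency is True and matches_lookalike is False, because the pattern is anchored at both ends. Keeping one filter per party, as described above, works just as well; combining them is purely a convenience.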

Filter 3: removing any contaminated data

This filter ensures that your data, and only your data, is collected into your Google Analytics profile. For example, it is possible for another web site owner to copy your GATC onto their own pages, therefore contaminating your data with their own web site traffic. The simple include filter shown below, applied to your Google Analytics profile, will ensure only traffic to the mysite.com domain is reported on.

[Figure: filter to include only your own web site traffic in Google Analytics]

Of course it may be desirable to collect data from multiple web sites into one profile. In that case, add the multiple domains in the Filter Pattern separated with a | character, for example:

Filter Pattern: mysite\.com|yoursite\.com
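The backslashes in that pattern matter. The following Python sketch, using made-up hostnames and a hypothetical include_by_hostname helper, compares the escaped pattern with an unescaped one. It only imitates the matching step with Python's re module and is not a description of Google Analytics internals.

```python
import re

# Pattern from the text above: two domains separated by "|", dots escaped.
INCLUDE_PATTERN = re.compile(r"mysite\.com|yoursite\.com")

# The same pattern without escaping: "." now matches ANY single character.
UNESCAPED_PATTERN = re.compile(r"mysite.com|yoursite.com")


def include_by_hostname(hits, pattern):
    """Return only the hits whose hostname matches the include pattern."""
    return [hit for hit in hits if pattern.search(hit["hostname"])]


# Made-up sample hits, for illustration only.
sample_hits = [
    {"hostname": "www.mysite.com"},      # your own site
    {"hostname": "yoursite.com"},        # the second site you want to keep
    {"hostname": "copycat.example"},     # someone who copied your tracking code
    {"hostname": "mysite-com.example"},  # slips through only the unescaped pattern
]

kept_escaped = [h["hostname"] for h in include_by_hostname(sample_hits, INCLUDE_PATTERN)]
kept_unescaped = [h["hostname"] for h in include_by_hostname(sample_hits, UNESCAPED_PATTERN)]
```

With the escaped pattern only the two wanted hostnames survive. The unescaped pattern also lets mysite-com.example through, because the dot matches the hyphen.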

Important tip:
When a filter is created within Google Analytics, it is applied immediately to new data coming into your account. New filters do not affect historical data, and it is not possible to reprocess your old data through a new filter. Therefore, always keep your “raw” data intact: keep your original web site profile unfiltered and apply new filters to a duplicate profile in your account.
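The reasoning behind this tip can be shown with a toy model. The Profile class below is purely hypothetical and far simpler than a real Google Analytics profile; it only illustrates that a filter acts on hits as they arrive, that data already stored is never reprocessed, and that an unfiltered profile keeps everything.

```python
class Profile:
    """Toy stand-in for a profile: stores the hits that pass its filters."""

    def __init__(self, name):
        self.name = name
        self.filters = []  # predicates; a hit is stored only if all return True
        self.stored = []   # processed data, never reprocessed afterwards

    def add_filter(self, predicate):
        self.filters.append(predicate)

    def receive(self, hit):
        if all(predicate(hit) for predicate in self.filters):
            self.stored.append(hit)


raw = Profile("raw")
filtered = Profile("filtered copy")


def send(hit):
    """Deliver the same hit to both profiles, as a duplicate profile would see it."""
    for profile in (raw, filtered):
        profile.receive(hit)


# Hypothetical office address; this hit arrives BEFORE the filter exists.
send({"ip": "203.0.113.10", "page": "/"})

# The exclude filter is added only to the duplicate profile.
filtered.add_filter(lambda hit: hit["ip"] != "203.0.113.10")

send({"ip": "203.0.113.10", "page": "/"})          # now dropped by the filtered copy
send({"ip": "198.51.100.7", "page": "/products"})  # kept by both profiles
```

After these three hits the raw profile holds all three, while the filtered copy holds two: the customer hit and the office hit that arrived before the filter was added. That leftover office hit is exactly the historical data a new filter cannot clean up.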

How have you approached the signal-to-noise ratio problem? The vast majority of Google Analytics installations I come across have no filters applied. Why is this? Please add your thoughts with a comment.

8 Comments

  1. Santiago

    Hi, I think I found a mistake in filter 3.

    It does not work. Because it does not limit data to that domain.

    The correct way should be something like: “exclude everything, but data from this site”

    Although I don’t know if this is possible

  2. Jack the Dallas Handyman

    I’m just a newbie in the field of Web Analytics. I think reading your post is a good way to start. I’m also considering buying some of your books and take it from there.

  3. A.Hariri

    Good points. I am just trying to start my way through the Analytic world and I found your articles helpful. I have lately bought your Advanced Web Metrics Book but I haven’t got the time to read yet as I am in the middle of my dissertation.

    Thank you!

  4. Sandra

    I have a beginner’s question: I would like to see the statistics of subcategories of my website, e.g. if the website is http://www.xyz.com I would like to filter http://www.xyz.com/bikes/. As far as I can see from the Google Analytics help site, I need to set up a new profile in order to keep both the stats for the whole site and those for the subcategory. This means I have to insert two tracking codes in the ../bikes/ pages. I wonder if this is the simplest way to do it. I have about 10 subcategories I would like to track. Thanks for any advice, Sandra

  5. Brian Clifton

    Sebastien: OK fixed with a simple \ to escape it.

    Thanks for the heads up.

  6. Brian Clifton

    Sebastien: Yes you are correct but for some reason wordpress is not showing the ‘\’ character.

  7. Sébastien Brodeur

    The last example of filter should not be: mysite\.com|yoursite\.com ?
