[This article is part of a series entitled: GA Implementation ABCs]
In the first post of this series, What is the 1st thing to do when considering a web analytics implementation?, I discussed how important it is simply to get initial data in – before tackling the much wider (and more complex) issues of mapping your stakeholders, building your KPIs or assessing what your business needs from your web site. Essentially, my view is: get an initial feel for the project by getting the data in, and that means tagging all your site pages (including the tracking of non-standard pages such as PDF, EXE and ZIP files).
With data coming in, the 2nd thing to consider is adding filters. Filters in Google Analytics have many purposes, such as segmentation and report augmentation. In this post I focus on their role in data cleansing. Keeping the data ‘clean’ means removing visits that are unwanted or not valid. Think of this as improving the signal-to-noise ratio of your data. Having clear signals means you don’t waste time analysing what could be random events (noise) on your web site.
Example cleansing filters include:
- Your own access to your web site
– this can be a significant volume of non-converting traffic if your employees set their browser home page to the company web site. Such visits will inflate your visitor and pageview counts and decrease your conversion rates.
- Your web developers/designers updating content
– these visits can be significant in volume but, more importantly, web developers are likely to update and test conversion pages, triggering goals and inflating your conversion rate metrics.
- Data contamination
– other web sites copying your GATC, either deliberately or accidentally, which results in meaningless data being mixed with your web site visit data.
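To see how the first two kinds of noise pull conversion rates in opposite directions, here is a small arithmetic sketch. All visit and conversion figures are hypothetical and chosen only for illustration; they do not come from any real report.

```python
# Hypothetical figures, purely illustrative.
def conversion_rate(conversions: int, visits: int) -> float:
    """Return conversions per visit as a percentage."""
    return 100.0 * conversions / visits

customer_visits = 10_000
customer_conversions = 200

employee_visits = 2_000       # assumed home-page loads with no conversions
developer_visits = 100        # assumed test sessions
developer_conversions = 80    # assumed test runs through the goal pages

clean = conversion_rate(customer_conversions, customer_visits)
with_employees = conversion_rate(
    customer_conversions, customer_visits + employee_visits
)
with_developers = conversion_rate(
    customer_conversions + developer_conversions,
    customer_visits + developer_visits,
)

print(f"customers only:  {clean:.2f}%")
print(f"plus employees:  {with_employees:.2f}%")
print(f"plus developers: {with_developers:.2f}%")
```

With these made-up numbers, employee traffic drags the rate down while developer testing pushes it up, so the two errors do not reliably cancel out.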
All 3 of these should be removed by adding 3 filters to your GA configuration as follows:
Filter 1: removing yourself from the reports
Excluding known visitors is very straightforward. If visitors connect to the Internet via a fixed IP address, you simply select the predefined filter ‘Exclude All Traffic from an IP Address’ from the Filter Manager as shown:
Excluding visits from employees, your search marketing agency or any known third party, such as your web developers, is an important step when first creating your profiles. These visitors generate a relatively high number of pageviews in areas that will greatly impact key metrics, such as your conversion rates. For example, employees with their browser home page set to the company web site will show in your reports as returning visitors every time they open their browser, and most likely as one-page visits. Remember that the GATC deliberately breaks through any caching, so every one of these page loads is recorded; it is therefore important to keep employee visits separate from those of potential customers.
Similarly, web developers heavily test checkout systems for troubleshooting purposes. These tests also trigger GATC page requests, most likely for your goal conversion pages. You should therefore remove all such visits from your reports.
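If the IP address field of the filter is interpreted as a regular expression (an assumption here, in line with the Filter Pattern used in Filter 3), the dots in the address need escaping so they match only literal dots. The sketch below illustrates the idea in Python; the helper name `ip_to_filter_pattern` and the address 192.0.2.10 (a reserved documentation address) are hypothetical.

```python
import re

def ip_to_filter_pattern(ip: str) -> str:
    """Escape the dots in an IPv4 address and anchor both ends."""
    return "^" + re.escape(ip) + "$"

# Hypothetical office address from the reserved documentation range.
office_pattern = ip_to_filter_pattern("192.0.2.10")
print(office_pattern)  # ^192\.0\.2\.10$

# The escaped, anchored pattern matches only the exact address.
print(bool(re.search(office_pattern, "192.0.2.10")))   # True
print(bool(re.search(office_pattern, "192.0.2.100")))  # False

# An unescaped, unanchored pattern is looser than intended.
print(bool(re.search("192.0.2.10", "192.0.2.100")))    # True
```

Anchoring with `^` and `$` is a design choice for this sketch: it stops a pattern for one address from also matching longer addresses that merely contain it.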
Filter 2: removing your designers/developers from the reports
This is simply an extension of Filter 1, using the IP address of your agency in place of that of your own office. But what if IP addresses change each time they log in? I will discuss this scenario in a later post. However, the vast majority of business broadband lines use fixed IP addresses, so you should be OK.
Filter 3: removing any contaminated data
This filter is to ensure that your data, and only your data, is collected into your Google Analytics profile. For example, it is possible for another web site owner to copy your GATC onto their own pages – therefore contaminating your data with their own web site traffic. The simple include filter shown below applied to your Google Analytics profile will ensure only traffic to the mysite.com domain is reported on.
Of course it may be desirable to collect data from multiple web sites into one profile. In that case, add the multiple domains in the Filter Pattern separated with a | character, for example:
Filter Pattern: mysite\.com|yoursite\.com
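To get a feel for what this pattern keeps and discards, you can try it against a few hostnames outside Google Analytics. The following is a rough sketch in Python, assuming the pattern is applied as a partial-match regular expression against the hostname; the function `is_included` and the sample hostnames are hypothetical.

```python
import re

FILTER_PATTERN = r"mysite\.com|yoursite\.com"

def is_included(hostname: str, pattern: str = FILTER_PATTERN) -> bool:
    """Mimic an include filter: keep the hit if the pattern
    matches anywhere in the hostname (assumed partial match)."""
    return re.search(pattern, hostname) is not None

# Hypothetical hostnames seen in incoming data.
hostnames = ["www.mysite.com", "shop.yoursite.com", "www.copycat.example"]
kept = [h for h in hostnames if is_included(h)]
print(kept)  # ['www.mysite.com', 'shop.yoursite.com']
```

Under the same partial-match assumption, a hostname such as notmysite.com would also pass this filter, so a stricter pattern may be worth considering if look-alike domains are a concern.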
It is important to note that when a filter is created within Google Analytics, it’s immediately applied to new data coming into your account. New filters will not affect historical data, and it is not possible to reprocess your old data through the new filter. Therefore, always keep “raw” data intact – that is, keep your original web site profile and apply new filters to a duplicate profile in your account.
How have you approached the signal-to-noise ratio problem? The vast majority of Google Analytics installations I come across have no filters applied. Why is this? Please add your thoughts with a comment.