I often receive questions about Urchin – what it is (typically: Is it the commercial version of GA?), how it compares with Google Analytics, and how to choose between the two. This post, an extract from the latest version of the book, explains what Urchin is, its relationship to Google Analytics and why, if at all, you need to consider it.
Urchin Software Inc. is the company whose technology Google acquired in April 2005. That technology went on to become Google Analytics, a free web analytics service that uses the resources at Google (I explain more about its history in Google Analytics – Four Years on). Urchin software is a downloadable web analytics program that runs on a local server (Unix or Windows).
Typically, this is the same machine as your web server. The Urchin Software creates reports by processing your web server logfiles (including hybrid logfiles) and is commonly referred to as server-side web analytics. Example screenshots of Urchin Software (version 6) are shown in Figures 1 and 2.
Figure 1: Urchin 6 initial configuration screen
Figure 2: Urchin 6 visitor report showing A) individual (anonymous) visitor information, B) visits by date, and C) path and resultant transaction information for a particular visit.
Urchin is essentially the same technology as Google Analytics—the difference when using Urchin is that your organization needs to provide the resources for log storage and data processing. Urchin can provide complementary reports that Google Analytics currently does not (or cannot because of its methodology). Let’s look at some examples:
- Visitor history report
Tracking individual visitors enables you to view the path a visitor takes through your website as well as their referral information. For privacy reasons Google has deliberately taken the decision not to track individuals with Google Analytics. However, with the data collection and processing under your control, you have the freedom to do this with Urchin. Each visitor is tracked anonymously.
- Error page and status code reports
More than just reporting completed page views (as is the case for Google Analytics), Urchin can report partial downloads and any error code. It is possible to configure your website to report error pages within Google Analytics. However, Urchin software reports on errors out of the box because your web server tracks these by default.
- Bandwidth reports
Reporting on bandwidth allows you to view how “heavy” your pages are and how this impacts the visitor’s experience.
- Login reports
If your website has a login area, you can report on this access by username. This supports standard Apache (.htaccess) or any authentication that logs usernames in the logfile.
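The error, bandwidth and login reports described above all draw on fields that a web server writes to its logfile by default. As a rough illustration only (this is not Urchin's own processing, and the sample log lines and the function name `summarize_log` are invented for the example), the following Python sketch reads lines in the Apache Common Log Format and tallies requests by username, requests by status code, and total bytes served:

```python
import re
from collections import Counter

# Apache Common Log Format:
#   host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
)


def summarize_log(lines):
    """Tally requests per username, requests per status code, and bytes served."""
    users = Counter()
    statuses = Counter()
    total_bytes = 0
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match.group("user") != "-":  # "-" means no authenticated user
            users[match.group("user")] += 1
        statuses[match.group("status")] += 1
        if match.group("bytes") != "-":  # "-" means no response body was sent
            total_bytes += int(match.group("bytes"))
    return {"users": users, "statuses": statuses, "total_bytes": total_bytes}


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - alice [10/Oct/2009:13:55:36 +0000] "GET /members/ HTTP/1.1" 200 2326',
    '192.0.2.11 - - [10/Oct/2009:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.10 - alice [10/Oct/2009:13:57:12 +0000] "GET /members/report.pdf HTTP/1.1" 206 1024',
    '192.0.2.12 - bob [10/Oct/2009:13:58:40 +0000] "GET /members/old HTTP/1.1" 404 312',
]

summary = summarize_log(SAMPLE_LOG)
print(summary["users"])
print(summary["statuses"])
print(summary["total_bytes"])
```

The 206 (partial content) and 404 (not found) lines are the kind of request a page tag never sees, which is why a logfile-based tool can report them without extra configuration.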
Differences between Google Analytics and Urchin
With two analytics products from Google to choose from, how do you determine which one of these is right for your organization? As you may have guessed, Google Analytics is perfect for most organizations, for two very simple reasons:
- Google Analytics is a free service
This is generally considered a major benefit for small and medium-size organizations where budgets for analysis are tight. Urchin software is a licensed product and therefore must be purchased (currently $2,995 per installation).
- Google Analytics handles a large part of the IT overhead
That is, Google conducts the data collection, storage, program maintenance, and upgrades for you. This is generally considered a major benefit for large organizations where web analytics is a priority for the Marketing department and less so for the IT department. If your organization is using Urchin software, it is responsible for the IT overhead. Hence, good interdepartmental communication (IT and Marketing) is required.
The second point is not trivial. In fact, in my experience, the IT overhead of implementing tools was the main reason why web analytics remained a niche industry for such a large part of its existence. Maintaining your own logfiles has an overhead, mainly because web server logfiles get very large, very quickly. As a guide, every 1,000 visits produce approximately 4 MB of log info. Therefore, 10,000 visits per month are approximately 500 MB per year. If you have 100,000 visits per month, that’s 5 GB per year, and so on. Those are just estimates—for your own site, these could easily double. At the end of the day, managing large logfiles isn’t something your IT department gets excited about.
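To adapt the rule of thumb above to your own traffic, the arithmetic can be sketched in a few lines of Python. The 4 MB per 1,000 visits figure is the estimate quoted above; the function name and its parameters are only illustrative, and you should substitute a ratio measured from your own logfiles.

```python
MB_PER_1000_VISITS = 4  # rule-of-thumb estimate from the text; measure your own


def yearly_log_size_mb(visits_per_month, mb_per_1000_visits=MB_PER_1000_VISITS):
    """Estimate the raw logfile volume generated in one year, in megabytes."""
    visits_per_year = visits_per_month * 12
    return visits_per_year / 1000 * mb_per_1000_visits


for visits in (10_000, 100_000):
    size_mb = yearly_log_size_mb(visits)
    print(f"{visits:,} visits/month is roughly {size_mb:,.0f} MB/year")
```

This gives 480 MB and 4,800 MB respectively, which are the figures rounded to 500 MB and 5 GB above. Passing `mb_per_1000_visits=8` models the case where your own site's logs are twice as heavy.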
Urchin also requires disk space for its processed data (stored in a proprietary database). Though this will always be a smaller size than the raw collected numbers, storing and archiving all this information is an important task because if you run out of disk space, you risk file or database corruption from disk-write errors. This kind of file corruption is almost impossible to recover from.
As an aside, if you maintain your own visitor data logfiles, the security and privacy of the collected information about your visitors also become your responsibility.
Why, then, might you consider Urchin software at all?
Urchin software does have some real advantages over Google Analytics. For example, data is recorded and stored by your web server, rather than streamed to Google, which means the following:
- Data processing and reprocessing
Urchin can process data as and when you wish, for example, on the hour, every hour. You can also reprocess data—to apply a filter retroactively or to correct a filter error. Google Analytics reports are three to four hours in arrears and cannot reprocess data retroactively (in my opinion, the benefit of reprocessing data is the strongest advantage of Urchin).
- Unlimited data storage
Urchin can keep and view data for as long as you wish. Google Analytics currently commits to keeping data for a maximum of 25 months, though to date, Google has made no attempt to remove data older than this—see Figure 3.1.
- Third-party auditing
Urchin allows your data to be audited by an independent third party. This is usually important for publishers who sell advertising space on their site, where auditing is required to verify visitor numbers and provide credibility for advertisers (trust in their rate card). Google Analytics does not pass data to third parties.
- Intranets and firewalls
Urchin works behind the firewall; that is, it’s suitable for intranets. Google Analytics page tags cannot run behind a closed firewall.
- Database access
Urchin stores data locally in a proprietary database and includes tools that can be used to access the raw data outside a web browser, allowing you to run ad hoc queries. Google Analytics stores data in remote locations within Google datacenters around the world in proprietary databases and does not provide direct access to the raw data for ad hoc queries. That said, the Google Analytics API does allow you to query your processed data.
Note: Urchin is sold and supported exclusively through a network of Urchin Software Authorized Consultants. For a full list of USACs, see www.google.com/urchin/usac.html.
When it comes to adoption numbers, Urchin is certainly the smaller sibling compared to GA. However, when appropriate I recommend using both side by side.
Many people first come into contact with Urchin from their ISP/hosting account. Are you an Urchin user? Are you on version 5 (2004) or version 6 (2008)? I would love to hear your feedback. I am planning the next post on Urchin to discuss the criteria for selecting GA v Urchin. If that interests you, let me know – it motivates me to write it…!
That said, after a year and a half of use, I have to say I haven’t regretted the decision. One of the biggest benefits of Urchin is to be able to process reports on-demand, for as far back as we have log files.
For instance, just recently we ran into our site being slammed by traffic from Google Images. While I am normally very against filtering traffic, the volume was such that it was making any sort of analysis of the targeted audience for the site impossible.
This had started a month before anyone really realized what was going on. With Google Analytics, I could have put on a filter, but I would be stuck with that month being out of whack.
With Urchin, I was able to put on a highly specific filter (all traffic from Google Images that has “tsunami” in the URI) and reprocess that month. The report was corrected not just from this point on, but from before the problem started.
That is the first time it has happened, but it was great to know that I could fix my reports with 30 minutes of work, instead of having to tell my colleagues that they were just going to have to live with that month.
The major drawbacks of Urchin were that the interface was ugly and unfriendly, and important information was scattered in a handful of different pages within a report.
But now, with them opening up an API for Urchin and a little coding on my end, I can give my coworkers a more attractive interface with all of the information in one place.
So for us, Urchin is really the perfect solution to web stats. Even if my bosses came to me today and told me we could start using Google Analytics, I would tell them we don’t need to.
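As a rough illustration of the kind of filter described in the comment above, here is a Python sketch that drops logfile lines whose referrer is Google Images and whose requested URI contains “tsunami”. It is not Urchin's filter syntax. It assumes logs in the Apache Combined Log Format and assumes that Google Images referrers contain the string `images.google.`; the sample lines are invented.

```python
import re

# Apache Combined Log Format:
#   host ident authuser [date] "method uri protocol" status bytes "referrer" "user-agent"
COMBINED_PATTERN = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<uri>\S+)[^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def is_excluded(line):
    """True if the hit came from Google Images AND the URI contains 'tsunami'."""
    match = COMBINED_PATTERN.match(line)
    if match is None:
        return False  # keep anything we cannot parse
    from_google_images = "images.google." in match.group("referrer")  # assumed referrer form
    has_keyword = "tsunami" in match.group("uri").lower()
    return from_google_images and has_keyword


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '198.51.100.7 - - [02/Mar/2010:10:00:00 +0000] "GET /photos/tsunami-2004.jpg HTTP/1.1" '
    '200 48211 "http://images.google.com/imgres?imgurl=example" "Mozilla/5.0"',
    '198.51.100.8 - - [02/Mar/2010:10:00:05 +0000] "GET /articles/tsunami-warning.html HTTP/1.1" '
    '200 9120 "http://www.google.com/search?q=tsunami" "Mozilla/5.0"',
    '198.51.100.9 - - [02/Mar/2010:10:00:09 +0000] "GET /photos/sunset.jpg HTTP/1.1" '
    '200 30500 "http://images.google.com/imgres?imgurl=example" "Mozilla/5.0"',
]

kept = [line for line in SAMPLE_LOG if not is_excluded(line)]
print(len(kept), "of", len(SAMPLE_LOG), "lines kept")
```

Because the raw logfiles are still on disk, a filter like this can be applied to any past period, which is the reprocessing advantage the comment describes.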
Sean: Thanks for the great insight. Now that you are allowed to set first party cookies I would recommend setting up Urchin with the UTM method i.e. hybrid (logfile + cookies). Logfiles on their own are incredibly inaccurate. The accuracy whitepaper has more details on this.
If you are in two minds about GA versus Urchin, you can always use both side by side: http://www.advanced-web-metrics.com/blog/2007/10/17/backup-your-ga-data-locally/
Sean is referring to the following announcement:
From @brianclifton: Big news for the world of Google Analytics: GA Officially deemed Secure Enough to Get US Federal Government Approval http://bit.ly/cgrnrj
Some webmasters integrate Google Analytics into their own management console that they coded themselves. Is this harder to do with Urchin?
intranet software: I wouldn’t describe it as easy or hard – just different…
How do I get values in the username report in Urchin? This report is in the Domains and Users section. My report does not show any values. All other reports are working fine.
RM/GAC: You need to contact an Urchin Software Authorised Consultant (USAC). List is at: http://www.google.com/urchin/usac.html
If you wish to use my company, we are listed as GA-Experts.com
Can you please help me configure Urchin 6 for an intranet application? My application uses Apache Tomcat 6.0.10. I have deployed the application and it is running, and I want to know how to configure it with Urchin 6.
Please explain step by step; it would be a great help for beginners like me.
Wendell: Thanks for the Epik link. I know those guys well and they are experts in this area.
Hannah: As Wendell points out (comment #3), you need IT support to run and manage an Urchin server. If you have that as a small business, then great. Otherwise, I recommend GA.
Brent: Yes, the power of Urchin is that it can be used in Hybrid mode – that is logfile + page tag to give you the best of both methodologies. See this article from me on the advantages: http://www.advanced-web-metrics.com/blog/2007/10/07/hosted-v-software-v-hybrid-tools/
However, please note that accuracy is not a function of the tool itself – either GA, Urchin, Omniture, WebTrends or any other tool. If there is an issue of numbers not adding up, then that is an implementation problem that needs addressing. For more information on accuracy issues of web analytics, have a look at the Accuracy Whitepaper – http://www.advanced-web-metrics.com/blog/2008/02/16/accuracy-whitepaper/
Also, for a comparison of vendor metrics (GA, Yahoo, Nielsen) with equal best practice implementations, see: http://www.advanced-web-metrics.com/blog/2008/12/16/web-analytics-accuracy-comparing-google-analytics-yahoo-web-analytics-and-nielsen-sitecensus/
We have recently moved away from GA because it is simply not accurate enough (specifically, we were unable to get it to accurately capture revenue). However, we looked at running Urchin in parallel with GA so we would get the best of both worlds, and comparing the numbers gives you some idea of how accurate they are. Besides the auditability, I think running both is a good option for a lot of companies: you get all the reports that GA offers, and if you want to get at more detailed data (that the GA API does not offer) you can do that.
For those that are interested, EpikOne (an Urchin Software Authorized Consultant) has written a great article about typical server requirements for Urchin 6. You can read it here:
I’ve been using GA for a year now, but I haven’t had the chance to really explore all its features. I guess I’m content with what GA can provide as a small business owner … but Urchin, which I’ve been hearing about for quite a while now, really interests me. I’m not an analyst, but your article led me to think how valuable it would be to have a licensed analytics product such as Urchin. It is quite expensive, but I will definitely look into it.
This is a great post and it answers a lot of questions I had about Urchin. I definitely would like to read a follow up post on the criteria for selecting GA or Urchin.
The issue that concerns me with regards to Urchin is the huge amount of data that it creates. I have a client that could use the Urchin tool instead of Google but what makes it prohibitive to us is the IT requirements and the amount of data that will be created. Is there a resource that you would recommend online that can detail the ideal server configuration and requirements for Urchin?
Thanks in advance Brian.
Brad: Take a look at gaforflash at http://code.google.com/apis/analytics/docs/tracking/flashTrackingIntro.html. This is the way to track distributed Flash videos.
I can’t see how you can dedupe visits from different domains unless you request a login from the user and this is matched for all domains. The only other method would be to use a system with a third party cookie – not recommended for best practice privacy reasons. BTW, GA and Urchin only use first party cookies.
Nice article. I’m a longtime GA user, but it has a big shortcoming that I can’t seem to get around. Our sites have a lot of videos (Flash format, in our own player, not YouTube). People can watch them on our sites or embed them elsewhere. When they are embedded on other sites, our tracking with GA is very limited. I would like to know on which domains or URLs our embedded videos are being watched, as well as the unduplicated unique visitor count (visitors to our domain as well as people watching our videos on other domains). Would Urchin be able to do this?