Is There a Future for Web Analytics?

In 2008, as I was walking out of the door at Google, I predicted that the next big thing in web analytics would be privacy. Of course I was wrong about the timing, but it did come eventually in the form of the GDPR. And thank God for that – the wild west of the intrusive surveillance economy was rapidly getting out of hand.

In October this year, Juan Mendoza – the excellent essayist behind The Martech Weekly – wrote an opinion piece that inspired this post. Here is an extract from Juan’s thought-provoking article:

“The challenge with web analytics is that it’s stuck in a crossfire between regulators and the companies that have come to define how the internet works. The EU government is on a crusade to implement consent banners for every website, and change the paradigm for what consumers should know about online tracking.”

I agree: the data industry I work in is on a knife edge. However, I do think Juan is wrong in blaming the EU for consent banners. Nobody likes or wants them – not even strong privacy advocates such as myself, who have built a career as a data practitioner. Apart from the horrible UX, no one ever reads them, which of course defeats the purpose of protecting consumers and giving them an informed choice. In fact, nowhere in the GDPR or the ePrivacy Directive does it say a site must have a consent banner.

Where did we go wrong?

The data industry has lost trust – like dirty water, users do not want it

The consent issue could and should have been solved years ago by browser developers, or OS engineers. Google is a leader in both of these areas and clearly has a vested interest in pushing back against privacy regulation – which they do.

The online ad industry is worth a quarter of a trillion dollars to Google. So of course there has been little to no thought given to letting users manage their privacy in a smart way via their browser or operating system, i.e. centrally managed by the user rather than repeatedly asked for per site. It takes strong regulation to force that to happen.

And don’t get me started on why Google’s consent mode is at best unethical and at worst breaks privacy laws. In a nutshell, a user’s explicit response of “no to tracking” is reinterpreted by Google as “yes to less invasive tracking”. Oh, and you just have to trust Google that the slightly different way of tracking you is actually something you are comfortable with. It’s opaque.

No means no, Google. Anything else erodes trust.

Google’s Topics API is an interesting contender for reducing tracking invasiveness and on the surface I like it. But it should have been developed 10 years ago! And that is the problem – it’s now too late in my opinion. The next generation of consumers is much more savvy about data harvesting, their privacy rights, and the fact that if something is free, they are the product. Given a binary choice of Accept all or Reject all, why would any user opt in to being tracked by any method?

Juan summarised the situation succinctly – “We had our moment and blew it”. The trust relationship of web/app analytics is poisoned. Like dirty water, it’s impossible to clean – certainly not to the point where people are willing to drink it again.

Basic questions are still a struggle in 2024

Another issue is that despite huge improvements in technology over the past decade, digital marketers still struggle to apply advanced web metrics. Incredibly, it’s been 150 years since this quote: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half”. Yet marketers still complain that the situation has hardly improved.

On a global scale that is billions of dollars wasted each year.

However, it is a myth that web analytics is only valuable if you profile individuals, i.e. track everything that a single person does on your site (first-party data) and where they go online before and afterwards (third-party data). It didn’t always look like that. Benign data does exist – and it is just as valuable, in my opinion.

I first wrote about this in 2019. Marketers (and other stakeholders) primarily need to know that campaign_X brought visitors to their website. That on visit N a certain number of visitors got to step 1 of the conversion funnel, to step 2 on visit N+1, and so on. They need to be able to assess whether campaign_Y performs better and, if both campaigns run together, whether there is a 2+2=5 effect.

An analyst can go into great detail in answering these questions without visitors’ individual details. It is benign data at the point of collection – because it’s aggregated data. All the individual identifiers of a tracking pixel can be dropped, i.e. not collected in the first place or scrubbed during processing, and you still get to answer these key questions.
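As a rough illustration of what "dropped at the point of collection" could look like, here is a minimal Python sketch. The hit records, field names and the `aggregate` helper are all hypothetical, invented for this example; it simply counts hits per campaign and funnel step and never stores the IP address or cookie ID. It counts hits rather than unique visitors, and does not attempt the visit-number breakdown described above.

```python
from collections import Counter

# Hypothetical raw hits as a tracking pixel might receive them.
# The field names are assumptions made for this illustration only.
raw_hits = [
    {"campaign": "campaign_X", "funnel_step": 1, "ip": "203.0.113.7", "cookie_id": "abc123"},
    {"campaign": "campaign_X", "funnel_step": 1, "ip": "203.0.113.9", "cookie_id": "def456"},
    {"campaign": "campaign_X", "funnel_step": 2, "ip": "203.0.113.7", "cookie_id": "abc123"},
    {"campaign": "campaign_Y", "funnel_step": 1, "ip": "198.51.100.4", "cookie_id": "ghi789"},
]

# Only these fields survive collection; everything else is discarded.
KEPT_FIELDS = ("campaign", "funnel_step")


def aggregate(hits):
    """Count hits per (campaign, funnel_step) without keeping identifiers."""
    counts = Counter()
    for hit in hits:
        key = tuple(hit[field] for field in KEPT_FIELDS)
        counts[key] += 1
    return counts


report = aggregate(raw_hits)
for (campaign, step), hits in sorted(report.items()):
    print(f"{campaign} step {step}: {hits} hits")
```

The stored report contains only campaign names, funnel steps and counts, so there is nothing in it that points back to an individual visitor.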

Advanced web metrics is about doing the basics really well and applying them in a clever way

To me, advanced web analytics is about doing the basics really well and applying them in a clever way. I argue that if done well, such analysis of aggregate data can significantly push the needle down on wasted advertising – way below 50% – and WITHOUT the need for an annoying pop-up consent banner.

The French data protection watchdog (CNIL) agrees, and recently the Spanish AEPD came to the same conclusion (as did Latvia). However, the problem I see is that agencies and businesses have become so blinded by the obsession of Google, Facebook and others with personalised tracking and targeting that they have forgotten the basics.

Of course, individual customer data is still needed – when they purchase or provide contact details. At that point, it is perfectly legitimate to track an individual user – now your customer – as a first-party data subject. Data protection is still applicable for these data subjects. However, the guidance from CNIL, AEPD et al. is that, assuming you limit the purpose of this individual data to serving your customers’ needs and running your business (e.g. not passing it to third parties for re-marketing), no annoying pop-up consent banner is required.

What is benign data collection?

I mentioned earlier benign data. It can be hard to visualise, so I like to use offline examples to illustrate digital practices…

Say I want to measure the road traffic near my children’s school for safety reasons. Over a one-hour period, I count and record the number of vehicles passing the school gates; I have equipment to measure their speed; I note the type of vehicle (car, bus, truck etc.); I note if the vehicle is actually dropping off a child; I include other metadata such as weather conditions, visibility etc.

Those are all examples of aggregate and benign data. No clever technology, reverse engineering, or smoke and magic can ever discern who the owner/driver of each vehicle is, where they came from, or where they went to next. My safety spreadsheet contains valuable information to help school and traffic planners without any concerns of privacy.

A business cannot survive without data

The above school study is a great example of data collection for public good. However, businesses also need such benign data in order to stay in business. For example, knowing how many visitors come to a site, at what time of day, how long they stay, which pages they view, which campaign or search query they clicked to arrive on the site etc.

These types of metrics are essential for any business – they would not survive for very long without them. The basics of growing a successful business beyond a small number of customers rely on being able to predict stock levels, staff requirements, opening times, potential interest, being able to recognise an existing customer, and so on. Consumers expect and even demand that businesses are on top of this type of fundamental and essential data. And people’s job security relies on it.

What is invasive data collection?

However, if as part of the same study I also collected license plate details of each vehicle – a unique fingerprint of car ownership details – my data now becomes personal. Even if I apply a cryptographic hash to the number plates so they are not human readable, the hash remains a unique identifier and one that is very hard to change (how often do you change your car?).

Now imagine every road on the planet has a monitor that is hashing the license plate of every vehicle that passes. All you need to do is compare the hashes and you have tracked every movement of a person. And as soon as that person logs in to a service – their Google or Facebook account, for example – that service knows who that hash belongs to.
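The license plate scenario can be sketched in a few lines of Python. The monitor names, times and plates below are made up, and SHA-256 is just one common choice of hash assumed for the illustration. The point it shows is that a hash is deterministic: the same plate produces the same value at every monitor, so grouping by hash rebuilds a vehicle's movements without anyone ever reading the plate.

```python
import hashlib
from collections import defaultdict


def hash_plate(plate: str) -> str:
    """Return a SHA-256 hex digest of a normalised plate (illustrative choice of hash)."""
    normalised = plate.replace(" ", "").upper()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Hypothetical sightings from independent roadside monitors: (monitor, time, hash).
sightings = [
    ("school_gate", "08:15", hash_plate("ABC 123")),
    ("high_street", "08:40", hash_plate("abc123")),
    ("school_gate", "08:17", hash_plate("XYZ 789")),
    ("office_car_park", "09:05", hash_plate("ABC 123")),
]

# Group sightings by hash: each group is one vehicle's trail.
trails = defaultdict(list)
for monitor, time, plate_hash in sightings:
    trails[plate_hash].append((time, monitor))

trail = sorted(trails[hash_plate("ABC 123")])
for time, monitor in trail:
    print(time, monitor)
```

None of the monitors stores a readable plate, yet the combined data set still traces one vehicle from the school gate to the high street to the office car park.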

Cookies on steroids

Data hashing is a tool for security, not privacy. Using a unique hash of personal details for tracking visitors is analogous to cookies on steroids. Accepting and deleting cookies, at least, is something the user has complete control over. If you hash a user’s email address, house address, telephone number, or license plate, however, that information rarely changes. There is also no mechanism for the user to delete it.

For example, if I purchase an item from site_1, then on another day purchase a different item from an unrelated site_2, the tracker collecting and hashing my purchaser details can match those hashes to me. Google’s Enhanced Conversions and Facebook’s Advanced Matching do exactly this.
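To make the contrast with cookies concrete, here is a small Python sketch. It is not how any particular vendor implements matching: the helper names, the example email address, the trim-and-lower-case normalisation and the use of SHA-256 are all assumptions for illustration. A cookie ID is a random value that is replaced when the user clears their cookies, whereas a hashed email is derived from the email itself and so comes out the same on every site, every time.

```python
import hashlib
import uuid


def new_cookie_id() -> str:
    """A cookie ID is a random value; a fresh one is issued after cookies are cleared."""
    return uuid.uuid4().hex


def hashed_email(email: str) -> str:
    """Deterministic identifier derived from an email (normalisation is an assumption)."""
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# The user clears their cookies between visits: the old ID is gone for good.
cookie_before = new_cookie_id()
cookie_after = new_cookie_id()

# The same person buys from two unrelated sites, typing their email slightly differently.
site_1_id = hashed_email("Jane.Doe@example.com")
site_2_id = hashed_email(" jane.doe@example.com ")

print("cookie IDs match:", cookie_before == cookie_after)
print("hashed emails match:", site_1_id == site_2_id)
```

Clearing cookies breaks the link between the two visits, but nothing the user does in the browser changes the hash of their email address.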

What can we do to improve?

If only we could move back to benign data collection. Pop-up consent banners would largely disappear, consumer trust could be rebuilt, and marketing would get better (less money wasted). There really can be a bright future for web analytics.

But…

We as an industry have become so entrenched in tracking the hell out of individual people in order to target them with personalised ads that we cannot see the wood for the trees. Yet contextual ads – the stuff that, ironically, Google pioneered in the early 2000s – have been shown to be just as effective as targeted ads, if not better.

If only we as an industry could collectively recognise the cliff edge that is right next to us, and move to the safer ground where users and consumers want us to be. As I ponder the future of web/app analytics, I give us a 50% chance of success.

This post was first published on LinkedIn, Jan 2024
