The rise and rise of “not provided” keywords

If you are active with search engine optimisation (SEO), then you will be aware of the issue of “not provided” showing in your Google Analytics reports for organic visits. To quickly summarise, in October 2011 Google implemented a change to how searches performed on their web properties can be tracked by website owners receiving the subsequent click-through traffic (see original post).

Essentially, if a visitor is logged in to their Google account when performing a search (for example, logged in to Gmail or any other Google service), Google takes this as a signal to request privacy – and therefore encrypts the session via SSL. The result is that when a visitor clicks on an organic result, no referral detail is passed to the receiving website, i.e. the keyword information is lost and “not provided” shows in your Google Analytics reports – see the technical note at the end of this post for more information.

Note: This affects ALL web analytics tracking tools – including Google Analytics.

Last updated 05-Aug-2013:
– Collected data up to 31-Jul-2013
– Added new data source (Global charity)
– Note the convergence at 46% for the control group (Figure 3)
– Now three inflection points are showing (i-1, i-2, i-3 of Figure 3).

The following chart plots the scale and growth of the “not provided” issue for 10 English language websites I work with. The total organic traffic analysed over the period is ~120 million visits. The legend is in order – to match the graphed lines.

Figure 1 – Percent of organic visits with “not provided” set


Figure 1 indicates there are 3 different zones – corresponding to 3 different types of users. I label these as A, B and C in Figure 2.

Figure 2 – same as above chart with three zones identified

  • Zone A (>70%):
    I consider zone A an outlier. It represents this website, which we know is targeted to Google users. Given that, I am surprised it is not 100%. That said, it does not represent other non-Google specific websites (yet!).
  • Zone B (50-70%):
    I consider these to be tech-savvy users that are most probably logged into a Google service, or search directly via the browser’s omnibox.
  • Zone C (<50%):
    This is the inverse of Zone B i.e. non-tech-savvy users.

Figure 3 – adding a “control” to the data

In Figure 3, I add data from two highly respected universities – represented by the gold lines for US and UK (the higher historical line is the US university). For these universities, I am assuming visitors are a broad mix of tech-savvy users who are most probably logged into a Google service (or search directly via the browser’s omnibox), and non-tech-savvy users (the inverse group).

The golden lines fit nicely between zones B and C and hence I use them as my benchmark. Figure 3 shows that above the golden line(s), the audience can be described as more tech-savvy, while below it they are less tech-savvy. At present (31-July-2013), the “not provided” benchmarks for the US-focused and UK-focused websites have both converged at 46%.

The inflection points (i-1 and i-2) correspond to the launch of “not provided” in the US and rest-of-world respectively.

The Latest Changes To Affect “not provided”

Clearly, losing the keywords visitors use to find your website is a big loss to any digital marketer. And although +Matt Cutts originally commented that the change would only affect a small proportion of your traffic, this was always going to increase over time. After all, logging into a Google service is exactly what Google would like all its users to do…

Now the browsers themselves are also impacting “not provided”. In July 2012 Firefox announced a switch to SSL for all Google searches, and Safari followed in September 2012. On January 18th 2013, Google announced the same change for Chrome. That is, even if the visitor is not logged into a Google service, the Chrome omnibox uses SSL for the visitor’s session. So this is another step towards seeing organic keyword detail all but disappear from your traffic reports. SEO is definitely getting harder…!

The inflection point (i-3) corresponds to Google’s change for the Chrome browser.

Figure 4 – Proportion of global web traffic by browser type


Technical Side Note

Although a browser normally sends no referrer detail when a visitor moves from an https page to an http page, the referring domain can still be passed by browsers that support meta referrer (currently only Chrome). That means https://www.google.com/ will show as the referrer, for example, though the query parameter containing the search term is not available. To cater for browsers that do not support meta referrer (currently IE, Safari and Firefox), Google redirects the visitor through an http URL, so a referrer is still passed but with the search terms removed (the q parameter is empty).
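
To illustrate what a tracking tool is left with, here is a minimal Python sketch of how a referrer might be classified. The function name, the simplified Google host check and the example URLs are my own assumptions for illustration; real analytics tools use their own, more thorough logic.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

NOT_PROVIDED = "(not provided)"


def organic_keyword(referrer: str) -> Optional[str]:
    """Return the search term carried by a Google referrer.

    Returns NOT_PROVIDED when the referrer is Google but carries no term,
    and None when the referrer does not look like Google at all.
    """
    if not referrer:
        return None
    parts = urlparse(referrer)
    host = parts.hostname or ""
    # Simplified host check (an assumption): good enough for a sketch,
    # not a robust test for every Google domain.
    if not (host.startswith("www.google.") or host.startswith("google.")):
        return None
    # keep_blank_values=True so an empty "q=" still appears in the result
    params = parse_qs(parts.query, keep_blank_values=True)
    term = params.get("q", [""])[0].strip()
    return term if term else NOT_PROVIDED
```

A referrer such as http://www.google.com/search?q=web+analytics yields the term, whereas a bare https://www.google.com/ or a redirect URL with an empty q parameter both end up as “not provided”.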

Privacy Side Note:

What I find odd (and disconcerting) is Google’s approach to AdWords traffic. That is, AdWords visits to your website are not affected – you still receive the keyword detail. It is strange to me that Google considers privacy important for organic searches, but not for paid searches.

Thanks to…

+Per Pettersson for spotting the recent Chrome change and +David Vallejo for his excellent technical support. Prof. Ingemar Cox and Dan Jackson of University College London for sharing their GA data.


31 Comments

  1. Seolab

    Hi Brian,

    is the SSL verification still a valid method to keep “not provided” out of my analytics data? I ask because this article is about 6 months old and I saw that Google announced that they will aim for a 100% not provided status. Thanks for your kind reply.

    Reply
    • Brian Clifton

      @seolab – using ssl on your site makes no difference as Google uses a redirect to strip out the keywords. See the “Technical Side Note” under the last chart.

      BTW, the last update was in August. The next update is planned for Nov. Look out for tweets from me on this also…

      Reply
  2. Antti Nylund

    Does anyone know whether you get keyword data if you have a redirect from http to https in between?

    Reply
    • Brian Clifton

      You would expect that to work but unfortunately not. The redirect G has in place removes the kw data prior to sending to your site.

      Reply
  3. Greg Moore

    “Privacy” is a red herring.

    Not providing search phrases makes it more difficult to have organic pages that provide what the user is looking for – because you don’t know what the user is looking for.

    Giving search phrases to AdWords customers allows them to optimize based on query intent.

    Inexorably, organic search results become worse, while AdWords landing pages become better. Inexorably, people sense this and click on the ads more often.

    Privacy, smivacy. They don’t care about that. That’s why AdWords customers get the queries.

    Google claims to put users first, but in reality they want to increase paid clicks, even if that means organic results degrade.

    Google’s “Analytics Advocates” take care to say they are not “Google Analytics Advocates” but true “Analytics Advocates,” yet all they propose are (mostly worthless) workarounds. Maybe I missed it, but I’ve yet to hear them do any “Advocating” on this issue. They must be too busy having free lunches and being “thought leaders.”

    As the lines in your charts keep going up and to the right, one thing becomes increasingly important: A Search Box on every page.

    Reply
    • Brian Clifton

      Thanks for the related post, Benolt. Just to double emphasise the point you make in your post:

      You must assume the following hypothesis (perfectly valid to me):

      The share of brand awareness visits from all visits generated by “not provided” keywords is identical to the share of brand awareness visits from all visits generated by “provided” keywords.
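
      To make the hypothesis concrete, here is a small Python sketch with made-up numbers. The function name and the figures are illustrative assumptions, not taken from any real report; it simply applies the brand share seen in the “provided” keywords to the “not provided” visits.

```python
def estimate_hidden_brand_visits(
    provided_brand: int, provided_total: int, not_provided_total: int
) -> float:
    """Estimate brand visits hidden inside "not provided".

    Assumes the brand share among "not provided" visits equals the
    brand share among visits whose keyword was provided.
    """
    if provided_total <= 0:
        raise ValueError("provided_total must be positive")
    # brand share (provided_brand / provided_total) scaled to the hidden visits
    return provided_brand * not_provided_total / provided_total


# Hypothetical month: 1,000 visits with a keyword, 300 of them brand terms,
# plus 500 "not provided" visits.
estimate = estimate_hidden_brand_visits(300, 1000, 500)
```

      With these numbers the estimate is 150 brand visits hidden in “not provided”. It is only as good as the assumption behind it.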

      Reply
    • Brian Clifton

      @Roald – yep, Dan’s hack can be useful but the problem still remains (i.e. association is not causality)

      Reply
  4. Brian Clifton

    @Roald – nice tip, though I am not aware of any profile filters that can do this for you in GA. You would have to do this client side.

    Reply
  5. Roald Verheijdt

    Great post and good to know that it’s the SSL and not something that only affects GA. No need to switch to some other tool now.

    To make things slightly better you might think about the following:

    Create a new profile to keep your main profile clean. Then create a filter that replaces the “not provided” with something like “NP = url of landing page”

    At least now you know which LPs were triggered by the various “not provided” keywords, which might give you a hint about the real keywords.
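
    One way to picture the relabelling, sketched in Python against exported report rows rather than as an actual GA filter. The function name and the sample rows are hypothetical, and the “NP = …” label simply follows the suggestion above.

```python
def relabel_not_provided(keyword: str, landing_page: str) -> str:
    """Replace a "(not provided)" keyword with a label naming the landing page."""
    if keyword.strip().lower() == "(not provided)":
        return f"NP = {landing_page}"
    return keyword


# Hypothetical exported rows: (keyword, landing page)
rows = [
    ("(not provided)", "/blog/not-provided/"),
    ("web analytics book", "/books/"),
]
relabelled = [relabel_not_provided(kw, lp) for kw, lp in rows]
```

    Rows that already carry a keyword are left untouched, so the relabelled column can be grouped by landing page.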

    Hope it helps.

    Reply
  6. Jon Hibbitt

    The suggested reason for excluding the keyword data from organic is privacy. Unless you’ve paid for Adwords (Google’s primary business model) in which case you get the keyword data. Ahem – and the privacy of anyone clicking an Adword isn’t an issue? I guess not, as there would be no business in Adwords! Thanks for the https SSL heads up – looks like the way to go.

    Reply
  7. Deric Loh

    Seems like the message they are sending out is that user’s privacy is only secured if it’s purchased via ah-em Adwords. Leaving too much to organic revenue on the table > turn it off > more revenue $$ for Adwords.

    Reply
  8. Jacques Warren

    Hi Brian, OK, I see your point. But since all the “free” stuff is, to say it like it is, **subsidized** by PPC, which is almost all the company’s revenue, I’d say the real underlying message is: “Buy even more Adwords”!

    But maybe you’re right; they were just plain moronic when that was decided…

    Reply
  9. Jacques Warren

    “Buy Adwords” is the underlying message…

    Reply
    • Brian Clifton

      @Jacques – lol. Google offer a huge range of free (and very good) products, so I don’t think people have an issue with that. Rather than an attempt to extract more revenue, I feel this just hasn’t been thought through fully by them. There is simply too much money – and good will – on the table from advertisers for them to play silly games like that.

      Protecting privacy is laudable, but their reasoning breaks down when a) they allow the detail via an Ad click, b) don’t allow the detail via https connections.

      Reply
  10. Brian Clifton

    @David – I have looked into this further and I stand corrected. Google is removing the keyword data with their redirect even if your landing pages are https.

    Kinda odd isn’t it? I just don’t get how this ties in with Google’s privacy argument for doing this…

    Reply
  11. David Shapiro

    I’m working with a site that has SSL on most of the pages but not all. I was looking up keyword referrals to that URL and it still had (not provided) referrals. Does the entire site have to be https? Also, does Google Analytics have to be tagged in a certain way to have this data?

    Reply
    • Brian Clifton

      @David if you are using the latest ga.js you should be fine by default. Note, only the landing page URL is required to be https for the keyword to be captured in the cookies. Feel free to ping me your URL and target keyword off-list and I will take a look.

      Reply
  12. Robert Regehr

    Is it worth running your entire site on SSL to get this data? I have a verisign certificate installed and am toying with the idea of doing this for a wordpress site.

    Reply
    • Brian Clifton

      @Robert – yes, absolutely if you have the certificate already, redirecting all your visitors to the https version of your site will solve this problem 😉

      Reply
  13. Brian Clifton

    @Mihir – the issue of “not provided” affects all websites – regardless of what tracking tool you use. It is the technical implementation of SSL for searching that results in this.

    The only way around it is to run your site in SSL – see my previous comment.

    Reply
  14. Mihir

    Hi Brian,

    ‘Not Provided’ has really made it difficult for SEOs to analyze their traffic. With the increase in numbers of not provided in GA, I don’t know whether to stick with Google Analytics or go for a paid tool which can help our clients. Any idea how we can track the (not provided) keywords data?

    Recently ahref and other paid tools stopped providing keyword data and I guess google team notified them to stop providing such data.

    Reply
  15. Brian Clifton

    The only way around this is if the connection between Google and your website is always https. That way, all referrer detail is passed through. I think that is the way the Internet will go…

    https is very straightforward to set up on a webserver – almost just the selection of a checkbox. The issue is that without a valid certificate, a warning message pops up to the user. Everything is encrypted, but the browser is also expecting the identity of the site to be verified (i.e. by Verisign, Comodo, Thawte etc). And of course that costs money…

    Unfortunately you cannot have one without the other, which is what we are all looking for to solve this problem.

    Reply
  16. Greg Moore

    Six years ago, Google would not show the search queries that brought in AdWords visitors due to “privacy” concerns, yet all the search phrases that brought in organic visitors were shown in GA.

    Today it’s exactly the reverse.

    There is something I think of as “Google Speak” – pronouncements from on high that lack a rational basis. It’s paternalistic and patronizing. Things are the way they are “because I said so.”

    When Larry David is at Starbucks and he says, “give me one of those vanilla bullshit things,” he slyly reveals the whole Starbucks thing is 20% reasonably good coffee and 80% bullshit.

    Google is the same way. They are vulnerable. Their true motivations are obviously not their publicly stated ones.

    The rise of not provided prevents me from optimizing the last mile of search. I can’t provide satisfying content for search queries I can’t see.

    Google’s huge success is like a stock market bubble. Larry David could easily say, “Do one of those bullshit Google searches. Oh, my, a box to enter a query and a page of search results. Wow. Who would ever have thought of that.”

    Reply
  17. Pramod

    Correction, I meant integrating Google webmaster with GA.

    Reply
  18. Pramod

    So what about the SEO data we get by integrating GWO with GA? Is that also impacted?

    Reply
  19. Robert Regehr

    I find it hypocritical that Google provides data to Adwords advertisers but withholds it from everyone else in the name of “Privacy.” Clearly dollars trump so called privacy concerns.

    Reply
  20. Per Pettersson

    Very interesting that Google AdWords is not affected by this. Looks like everybody that needs keywords will end up paying for ads. Keep us posted on the subject Brian! Cheers.

    Reply
