https://dataingovernment.blog.gov.uk/2014/05/22/hacking-google-analytics-to-get-real-time-internal-search-terms-data/

‘Hacking’ Google Analytics to get real-time internal search terms data

Categories: Data insights, Google Analytics, Implementation
GOV.UK search box

I’ve previously blogged about how we’ve been experimenting with real-time data to create dashboards and custom automated email alerts. In this post I want to share how we’ve been ‘hacking’ Google Analytics to gain useful insights into what our users are currently searching for on GOV.UK.

On average, more than 100,000 internal searches are made daily on GOV.UK using more than 30,000 different terms. Internal search terms are a useful way of listening to what our users want, in their own words.

Such a vast amount of data makes it challenging to discern trends, but grouping search terms under relevant topics can give us clues about what is currently of interest to our users.

The challenge is compounded by the fact that Google Analytics by default does not show real-time internal search terms. However, we’ve found a way around this by using an Advanced filter to get Google Analytics to alias search terms as if they were page URLs.

First, we created a separate View (formerly known as a Profile) to only show our internal search terms, using this filter:

Filter to show only search pages

Note that we stripped query parameters out of URLs when we created the View. Next, we created a filter to lower-case all search terms so that, for example, ‘tax credit’ and ‘Tax credit’ are aggregated together as ‘tax credit’, since they are essentially the same search.

Filter to lower-case search terms

A third filter is then required to alias the search terms as URLs. The settings for the filter are below:

Filter to alias search terms
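
The combined effect of the three filters can be sketched in Python. This is only an illustration of the logic, not the Google Analytics configuration itself: the `Hit` record, its field names and the `/search` path are assumptions made for the example, and the real filter settings are the ones shown in the screenshots.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumed path of the internal search results page (hypothetical).
SEARCH_PATH = "/search"


@dataclass(frozen=True)
class Hit:
    """A simplified pageview record (hypothetical structure)."""
    page: str                   # page path, query parameters already stripped
    search_term: Optional[str]  # internal search term, if one was recorded


def apply_view_filters(hit: Hit) -> Optional[Hit]:
    """Mimic the three View filters; return None when the hit is excluded."""
    # Filter 1: keep only hits on the search results page.
    if hit.page != SEARCH_PATH or not hit.search_term:
        return None
    # Filter 2: lower-case the term so 'Tax credit' and 'tax credit' aggregate.
    term = hit.search_term.strip().lower()
    if not term:
        return None
    # Filter 3: alias the term as the page, so page reports list search terms.
    return replace(hit, page=term, search_term=term)
```

For instance, `apply_view_filters(Hit("/search", "Tax credit"))` yields a hit whose `page` is `tax credit`, while a hit on any other page is dropped from the View.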

Filters can take up to 24 hours to take effect, but once they are active, instead of seeing this in the All Pages report:

All Pages report before applying the filter

we see this:

All Pages report after applying the filter

This transformation lets us use the built-in real-time reports, with the Page dimension representing the search terms.

Active searches dashboard

This may be adequate for smaller-scale sites, but for GOV.UK we wanted to drill down further and show terms by the topics we were interested in, such as driving and theory tests.

We achieved this by first collating terms that relate to driving and theory tests and then creating a regular expression that matches those terms:

(.*|^)(theory test|driving test|driving licence|provisional|d1|driving license|provisional licence|practical test|provisional license|driving theory|book theory test|book driving test|practical driving test|driving theory test)(.*|$)
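
Before putting an expression like this into a widget, it can help to try it against a few sample terms. The snippet below is a minimal sketch using Python's `re` module, which accepts this pattern unchanged. The `matches_topic` helper and the sample terms are made up for illustration, and Python's regex engine is not guaranteed to behave identically to the one Google Analytics uses.

```python
import re

# The alternatives from the expression above, in the same order.
DRIVING_TERMS = [
    "theory test", "driving test", "driving licence", "provisional", "d1",
    "driving license", "provisional licence", "practical test",
    "provisional license", "driving theory", "book theory test",
    "book driving test", "practical driving test", "driving theory test",
]

# Rebuild the expression exactly as written in the post.
TOPIC_PATTERN = re.compile("(.*|^)(" + "|".join(DRIVING_TERMS) + ")(.*|$)")


def matches_topic(term: str) -> bool:
    """Return True if a (lower-cased) search term belongs to the topic."""
    return TOPIC_PATTERN.search(term) is not None


if __name__ == "__main__":
    for sample in ["book driving test online", "d1 licence", "tax credit"]:
        print(sample, "->", matches_topic(sample))
```

Because the View already lower-cases terms, the sketch matches case-sensitively. Short alternatives such as `d1` will also match inside longer, unrelated strings, so it is worth checking a sample of real terms for false positives.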

Then it was a case of configuring a real-time widget to show how many internal searches were being made that matched the regular expression:

Widget settings for active searches of driving and theory test

This only gives a number, so to enhance it further we created a supporting widget that shows the actual terms:

Widget settings for driving and theory test terms

This now results in a powerful real-time dashboard showing the current numbers of users searching for a particular topic and the terms they are using. In the dashboard below I’ve also shown examples for car tax and pensions topics.
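
The pairing of a count widget with a terms widget for each topic can be sketched as a simple aggregation. This is an illustration of the idea only, not how Google Analytics computes its widgets: the `summarise` function, the shortened driving pattern, and the car tax and pensions patterns are all hypothetical, since the post does not give the expressions used for those topics.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical topic patterns, for illustration only.
TOPICS: Dict[str, "re.Pattern[str]"] = {
    "driving and theory tests": re.compile(
        r"theory test|driving test|driving licen[cs]e|provisional"
    ),
    "car tax": re.compile(r"car tax|tax disc"),
    "pensions": re.compile(r"pension"),
}


def summarise(
    active_terms: Iterable[str],
    topics: Dict[str, "re.Pattern[str]"] = TOPICS,
    top_n: int = 5,
) -> Dict[str, Tuple[int, List[Tuple[str, int]]]]:
    """For each topic, return (number of matching searches, most common terms)."""
    terms = list(active_terms)  # allow any iterable, including generators
    summary = {}
    for topic, pattern in topics.items():
        counts = Counter(t for t in terms if pattern.search(t))
        # First value mirrors the count widget, second the terms widget.
        summary[topic] = (sum(counts.values()), counts.most_common(top_n))
    return summary
```

Given a list of currently active, lower-cased search terms, `summarise` returns one entry per topic, so adding a topic is just a matter of adding another pattern.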

Final version of the dashboard

An added advantage of this method is that these widgets can be created ‘on the fly’ for any topic, and dashboards can be shared with departmental stakeholders.

We’ll be working more with these real-time internal search term dashboards in the near future and will share further experiences. In the meantime, we’d be interested to know whether you have any good use cases for them.

7 comments

  1. Comment by Dominic Hurst posted on

    Great post Ashraf. Jim and Peter tipped me off to this at measurecamp as a great way to raise awareness to data using "real" metrics, ie the words of real users.

    All works a treat now at NICE, with further exposure on plasma screens. As you mention plugging this into the api to get on the web will bring further exposure too.

  2. Comment by Joshua Mouldey posted on

    Very clever!

    It would be great to put this data alongside the other real-time user data from an unfiltered view, but I think that's going to require using the API.

    • Replies to Joshua Mouldey

      Comment by Ashraf Chohan posted on

      Hi Joshua

      Thanks for your comment. It's not currently possible to put this data alongside visits data in the interface but we hope to further experiment using the API to build more powerful dashboards.

  3. Comment by Shuki Mann posted on

    This is brilliant man!
    I really loved that 🙂

  4. Comment by Rachel Purkett posted on

  5. Comment by portfolioseo posted on

    Great solutions, thanks to share.

  6. Comment by Nathan Wall posted on

    Brilliant solution.

    We implemented this on our beta site this morning, makes monitoring of local search easy. Thanks for sharing the workaround.