https://dataingovernment.blog.gov.uk/2017/10/10/developing-a-standard-approach-to-implementing-analytics-at-the-dwp/

Developing a standard approach to implementing analytics at the DWP

What is the best way to implement Google Analytics (GA) on Department for Work and Pensions (DWP) services? As DWP digital performance analysts, that is the question we asked ourselves last year, after realising that different services were often tagged differently. After a lot of head scratching, workshops, and experiments with mocked up services, we’ve begun creating a standard approach to implementing GA, some examples of which I’d like to share with you.

There are real benefits in creating a more standard approach, not least in saving time. We realised we were spending significant amounts of time working with services to work out how we should tag each page. This meant we were effectively creating a measurement framework from scratch each time a new service was set up. Also, there was a time lag if we identified a need to tag something that hadn’t originally been tagged.

As each service was tagged differently, if one of the team needed to do some analysis on a service they hadn’t set up, they had to learn the idiosyncrasies of how GA had been implemented. Comparing services was also difficult, as they might be recording similar actions differently.

Our project had two aims:

  1. Identify or develop the best way to tag different kinds of components on services, for example, radio buttons, validation errors, and outbound links. The ‘best’ way is that which allows us to answer all the questions we think we might need to answer about users’ interaction with a component or the best compromise, where this isn’t possible.
  2. Develop a standard GA implementation specification that could be used for all new services with the least amount of modification required. This should reduce the amount of time we spend developing bespoke implementations, and ensure we have the data to answer questions when they come up, not a few weeks later.

We’ve made the result of our efforts publicly available in this fairly technical and very comprehensive document. In this post I’ll cover radio buttons, linking to outcomes and session IDs.

Radio buttons

A lot of DWP digital services enable people to apply for a benefit, loan or grant, so they’re largely evidence-gathering exercises, where we ask users questions about their circumstances. Where only one response is possible to a question, for example ‘Yes’ or ‘No’, radio buttons are used. This makes them one of the most common components of DWP services.

As part of this work, we made a list of all the questions we’d been asked, and had asked ourselves about radio button questions. We also thought hard about the kinds of questions we thought we should be answering to identify ways we could add value. For example, we made sure we could answer the question ‘In what proportion of sessions does the user give more than one answer to this question?’.

This gave us, for each question, an idea of how often users were confused by it or changed their answer on the basis of what happened next. Also, by monitoring multiple answer rates, we could identify questions, not picked up by user research or feedback, that might need rewording or separating into multiple questions.
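As a rough illustration of the multiple answer rate, the sketch below works on exported event rows. The row shape (`sessionId`, `questionRef`, `answer`) and the function name are assumptions made for this example, as is the choice to count only sessions that answered the question at all.

```javascript
// Hypothetical row shape: { sessionId, questionRef, answer }, one row per radio click.
// Returns the share of sessions answering the question that gave more than
// one distinct answer to it.
function multipleAnswerRate(events, questionRef) {
  const answersBySession = new Map();
  for (const e of events) {
    if (e.questionRef !== questionRef) continue;
    if (!answersBySession.has(e.sessionId)) {
      answersBySession.set(e.sessionId, new Set());
    }
    answersBySession.get(e.sessionId).add(e.answer);
  }
  if (answersBySession.size === 0) return 0; // nobody answered this question
  let multi = 0;
  for (const answers of answersBySession.values()) {
    if (answers.size > 1) multi += 1;
  }
  return multi / answersBySession.size;
}

// Made-up data: four sessions answer Q15, one of them changes its answer.
const sampleEvents = [
  { sessionId: 's1', questionRef: 'Q15', answer: 'Yes' },
  { sessionId: 's1', questionRef: 'Q15', answer: 'No' },
  { sessionId: 's2', questionRef: 'Q15', answer: 'Yes' },
  { sessionId: 's3', questionRef: 'Q15', answer: 'No' },
  { sessionId: 's4', questionRef: 'Q15', answer: 'Yes' },
  { sessionId: 's4', questionRef: 'Q16', answer: 'No' },
];
const q15Rate = multipleAnswerRate(sampleEvents, 'Q15'); // 0.25
```

A question whose rate is much higher than its neighbours is a candidate for a closer look.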

Example radio button question

After a lot of learning from failure and iterating, we settled on a standard way of tagging radio button questions.

On each click of an answer, an event is sent with the category ‘Radio - Click’, the action set to the wording of the field, and the label set to the answer followed by the question’s internal reference. It’s important to make sure the label is unique within the whole property, otherwise you can’t segment based on the answer to a particular question (see my personal blog for a full explanation of this).

As part of the same event, a custom dimension is also set to the answer clicked on. This dimension is set to session scope, meaning each time it’s set, it overwrites any previous value for that session. This allows us to easily segment users based on the last answer given to a particular question.

The code that’s added to each radio button looks like this:

onClick="ga('send', 'event', 'Radio - Click', '[The wording of the field]', '[The answer] - [field internal reference]', {
  'dimensionX': '[The answer] - [field internal reference]'
});"
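Rather than writing that call by hand on every radio button, the arguments could be built by a small helper. This is only a sketch of one possible shape, not code from our specification: `buildRadioEvent`, its parameter names and the example question wording are all made up, and the dimension index is whichever slot the property has reserved.

```javascript
// Hypothetical helper: builds the argument list for the analytics.js
// ga('send', 'event', ...) call shown above.
function buildRadioEvent(fieldWording, answer, fieldRef, dimensionIndex) {
  // Label = the answer followed by the question's internal reference,
  // which keeps the label unique within the whole property.
  const label = answer + ' - ' + fieldRef;
  // Custom dimension (session scope is configured in GA admin) carrying the same value.
  const fields = {};
  fields['dimension' + dimensionIndex] = label;
  return ['send', 'event', 'Radio - Click', fieldWording, label, fields];
}

// Made-up question wording and dimension index, for illustration only.
const radioArgs = buildRadioEvent('Do you want help with travel?', 'Yes', 'Q15', 3);

// In the page, assuming analytics.js has already defined window.ga:
//   ga.apply(null, radioArgs);
```

Keeping the label format in one place makes it harder for two services to drift apart in how they record the same component.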

The custom dimension makes segmenting users based on their final answer to the question much easier. Without it, creating a segment based on the last answer given would require a sequence segment, which may also be inaccurate.

Here is an example. On a question with the labels ‘Yes - Q15’ and ‘No - Q15’, a segment for all those who gave yes as their final answer would need to include sessions with an event with the label ‘Yes - Q15’, while excluding sessions with the sequence ‘Yes - Q15’ followed by ‘No - Q15’. This would wrongly exclude users who answered Yes, No, Yes and, for questions with more than 2 possible answers, would get extremely complex. With the custom dimension, the segment is simply ‘dimensionX’ starts with Yes.
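The difference between the two segments can be simulated outside GA. The sketch below is an illustration only: both function names are made up, and each session is represented simply as the list of event labels in click order.

```javascript
// A session-scoped dimension is overwritten on every click, so the value GA
// keeps for the session is the last label that was sent.
function finalAnswer(labels) {
  return labels.length > 0 ? labels[labels.length - 1] : null;
}

// The event-only segment described above: include sessions with a 'yes'
// click, exclude sessions where a 'yes' click is later followed by a 'no'.
function sequenceSegmentMatches(labels, yesLabel, noLabel) {
  const firstYes = labels.indexOf(yesLabel);
  if (firstYes === -1) return false;
  const yesThenNo = labels.indexOf(noLabel, firstYes + 1) !== -1;
  return !yesThenNo;
}

// A user who answered Yes, then No, then Yes again.
const session = ['Yes - Q15', 'No - Q15', 'Yes - Q15'];

const byDimension = finalAnswer(session).startsWith('Yes');                    // true
const bySequence = sequenceSegmentMatches(session, 'Yes - Q15', 'No - Q15');   // false
```

The dimension-based check puts this session in the ‘yes’ segment, while the sequence-based check leaves it out even though the final answer was yes.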

The result is that we can answer all the questions we’re frequently asked about radio button questions, and also most of the ones we think we should be asked. There are two caveats, however. First, how long users spend on each question isn’t that easy to measure. We felt this was acceptable, as the alternative would be to take a completely different approach using virtual page views rather than events. Second, using custom dimensions does rely on having a GA 360 account or a very short service, as a service is likely to use more than the 20 custom dimensions available in the free version of GA.

This:

Using our recommended method you can create segments based on answers to questions like this (screenshot: a segment on the answer ‘Yes I want help - help with travel’)

Versus this:

Without using a custom dimension, to create a segment based on the answer to a question you’d need a filter that looks like this… (screenshot: a filter ending ‘... want some help to get to work? - help with Travel CONTAINS yes’)

Linking to outcomes

Linking to outcomes is less of a technical standard, more of an approach to our service. Wherever possible, we make sure we can link a user’s online behaviour to their outcomes, which are usually measured offline. For example, when we work on a service that lets users claim a benefit, loan or grant, we make sure we can match visits to the service with the admin data on outcomes, which we then import into GA.

This means we can identify and segment those whose applications were successful and those whose applications weren’t. We can then look at the behaviour of users who submit an application but aren’t awarded the benefit to see if there’s a point at which messaging around eligibility criteria isn’t as effective as it could be. If we don’t link online behaviour and outcomes, we’re not looking at the whole journey, so there’s a risk of optimising the service for clicks on the submit button, rather than the service’s actual objective.

In practice, implementing this has required us to be a bit creative, as it’s not always easy to make sure there’s a key to match GA and admin data. On some services, we’ve managed to pass the session ID to the admin data. This is our preferred method, as it makes it relatively easy to extract session ID and outcome from the admin data and import them into GA.

Where the back end systems don’t allow this, we’ve used other approaches, such as using the date of the session and the first half of the user’s postcode to try to match GA and admin data. On one service where we use this method we can match about 50% of the cases, which is enough to be able to analyse trends.
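A minimal sketch of the date-and-postcode match is below. It is an illustration under several assumptions rather than our production process: the record shapes and function names are hypothetical, the GA side is assumed to already hold the first half of the postcode for each session, and a pair is only accepted when the key is unique on both sides.

```javascript
// First half of a UK postcode (the outward code), e.g. 'LS1 4AP' -> 'LS1'.
// Assumes the second half is always the last three characters.
function outwardCode(postcode) {
  const compact = postcode.replace(/\s+/g, '').toUpperCase();
  return compact.slice(0, -3);
}

// Groups rows into a Map of key -> array of rows.
function groupByKey(rows, keyOf) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyOf(row);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

// Hypothetical shapes:
//   gaSessions:   { sessionId, date, outwardCode }
//   adminRecords: { date, postcode, outcome }
function matchOutcomes(gaSessions, adminRecords) {
  const adminByKey = groupByKey(adminRecords, r => r.date + '|' + outwardCode(r.postcode));
  const gaByKey = groupByKey(gaSessions, s => s.date + '|' + s.outwardCode.toUpperCase());
  const matched = [];
  for (const [key, sessions] of gaByKey) {
    const records = adminByKey.get(key) || [];
    // Ambiguous keys (more than one candidate on either side) are skipped.
    if (sessions.length === 1 && records.length === 1) {
      matched.push({ sessionId: sessions[0].sessionId, outcome: records[0].outcome });
    }
  }
  return matched;
}

// Made-up example data.
const gaSessions = [
  { sessionId: 'a', date: '2017-09-01', outwardCode: 'LS1' },
  { sessionId: 'b', date: '2017-09-01', outwardCode: 'M1' },
  { sessionId: 'c', date: '2017-09-02', outwardCode: 'LS1' },
];
const adminRecords = [
  { date: '2017-09-01', postcode: 'ls1 4ap', outcome: 'awarded' },
  { date: '2017-09-01', postcode: 'M1 1AE', outcome: 'refused' },
  { date: '2017-09-01', postcode: 'M1 2BB', outcome: 'awarded' },
];
const matchedOutcomes = matchOutcomes(gaSessions, adminRecords);
```

Here only session ‘a’ is matched: ‘b’ has two candidate admin records and ‘c’ has none. Skipping ambiguous keys is one reason a match rate well below 100% is to be expected with this kind of key.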

Session ID

Our approach to session ID is shamelessly copied from Simo Ahava (who has posted on this subject). It’s fantastically useful. For example, gathering a session ID as a custom dimension lets you look at the distribution of time on page, rather than just the mean, which is all the standard implementation of GA allows.
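One way a session ID could be generated and attached is sketched below. This is an assumption-laden illustration, not a copy of the post referred to above: `makeSessionId` is a made-up name, the ID format (timestamp plus random suffix) is one possible choice, and ‘dimensionY’ stands for whichever session-scoped dimension slot is reserved.

```javascript
// Hypothetical generator: a timestamp plus a short random base-36 suffix.
// nowMs and random are passed in so the function is easy to test.
function makeSessionId(nowMs, random) {
  return nowMs + '.' + random.toString(36).slice(2, 10);
}

// In the page, assuming analytics.js has already defined window.ga:
//   ga('set', 'dimensionY', makeSessionId(Date.now(), Math.random()));
//   ga('send', 'pageview');
// Because the dimension is session scoped, the last value set in a session
// overwrites earlier ones, so a fresh ID on each page load still leaves one
// ID per session in the reports.

const exampleId = makeSessionId(1507593600000, 0.5); // '1507593600000.i'
```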

This means we can use the median to measure session duration, which is usually a better indication of the typical session, as it’s not affected by outliers as much as the mean is. A few hour-long sessions won’t affect the median, but might have a big impact on the mean. Also, having access to the distribution means you can really understand how long users are spending on the service.
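The effect of an outlier on the two measures is easy to see with a few numbers. The durations below are made up for illustration, and both helpers assume a non-empty list.

```javascript
// Arithmetic mean of a non-empty list of numbers.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Median of a non-empty list of numbers (average of the two middle values
// when the list has an even length).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Session durations in minutes: four typical sessions and one hour-long one.
const durations = [4, 5, 5, 6, 60];

const meanDuration = mean(durations);     // 16
const medianDuration = median(durations); // 5
```

The single hour-long session pulls the mean to 16 minutes, while the median stays at 5, which is much closer to what most of these sessions looked like.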

The distribution of time on page for sessions that completed the service

Next steps

We’ve got to a stage where we think we’re at a good starting point for creating a standard approach to implementing GA, but it’s not finished. It probably never will be, not least because as GA itself evolves we’ll need to adapt it to keep up.

We’ve tested our approach fairly extensively within DWP and it works pretty well, but we’d be very interested in whether it’s useful for services in other departments or private sector organisations. We’d be particularly interested in any ways to improve it, so get in touch if you think you’ve spotted a way to make it better. We’re also more than happy to talk colleagues in other departments through it, so if you want to know more please contact me.

Mike Suter-Tibble is Head of Digital Performance Analytics at the Department for Work and Pensions.
