
What does it all mean?

Categories: Data science

The UK runs lots of public consultations – over 500 since January! Not to mention the EU, Scotland, Wales and Local Authorities... But how can we get better at learning from all this consultation?

As part of our data science programme, GDS have been talking to the European Commission about their recent consultation on copyright. We met the EU Director General for digital (@eurohumph) and a team from DG Markt (the EU department that makes rules on free trade).

The copyright consultation had 80 wide-ranging questions and drew 9,599 responses in 26 languages and a variety of document formats. Teams of European Commission staff have now read them all. GDS worked with Ripjar (a UK startup specialising in data analysis), which looked at the publicly available consultation responses on an exploratory basis to illustrate how algorithmic analysis can complement the manual work of the teams of officials involved.

We saw how a computer program can be trained to spot patterns in the text (using machine learning). Below is a visualisation of question one from the consultation. This is not a standard word cloud – the size and position of the words depend on their significance and on which other words they are related to in consultation responses:

Ripjar Screenshot 3
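
To give a feel for what "significance" and "related words" could mean in practice, here is a minimal Python sketch. It is not Ripjar's method, which we have not described here; it assumes a simple TF-IDF score for significance and counts how often two words appear in the same response as a stand-in for relatedness. The function names are made up for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase a response and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def term_significance(responses):
    """Score each word by TF-IDF, summed over all responses (an assumed measure)."""
    docs = [tokenize(r) for r in responses]
    n_docs = len(docs)
    # Document frequency: in how many responses does each word appear?
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        for word, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word])
            scores[word] += tf * idf
    return scores


def co_occurrence(responses):
    """Count how often each pair of words appears in the same response."""
    pairs = Counter()
    for response in responses:
        words = sorted(set(tokenize(response)))
        pairs.update(combinations(words, 2))
    return pairs
```

In a sketch like this, a word that appears in every response scores zero, while words concentrated in a subset of responses score higher. A visualisation could then size words by their score and place frequently co-occurring words near each other.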

This highlights the issue of access to videos (after legal action by GEMA, a German performing rights organisation). The tool also identified cases where many respondents used the same text (perhaps from industry leaders or trade associations). It summarised responses and extracted key messages after a few hours of analysis, and Commission officials recognised it as a really useful complement to the manual analysis by their teams of expert reviewers.
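
One simple way to spot responses that reuse the same text is to compare overlapping runs of words ("shingles") between responses. The Python sketch below is only an illustration under that assumption and is not necessarily how the tool works; the function names, the shingle size of 3 and the 0.6 threshold are all hypothetical choices.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in a response."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short responses become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_shared_text(responses, threshold=0.6):
    """Return index pairs of responses whose shingle sets overlap heavily."""
    sets = [shingles(r) for r in responses]
    return [
        (i, j)
        for i, j in combinations(range(len(responses)), 2)
        if jaccard(sets[i], sets[j]) >= threshold
    ]
```

This compares every pair of responses, which is fine for a sketch but grows quickly with the number of responses; a larger consultation would probably need an approximate technique such as MinHash.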

We’d like to see similar machine learning used in more UK consultations. It could make better use of human readers (eg by passing them responses to analyse in a more helpful order) and help make sense of overall messages. We want to give ministers looking at the enormous volume of the public’s responses another way to ask: “what does it all mean?”

Sharing and comments

  1. Comment by Sascha

    ... because computers are really good at detecting sarcasm, right?

  2. Comment by Fraser

    Erm, if you're talking about computer-assisted qualitative data analysis then I think you'll find that the tools are already out there (e.g. QSR NVivo). They already have visualisations and they integrate nicely with social media and the web too.

    Natural Language Processing is flaky, in my honest opinion.