How we are using machine learning to detect GOV.UK feedback spam
The GOV.UK feedback form was receiving a lot of spam requests. We developed a machine learning model to detect spam responses — here is how we created it.
...techniques to obscure sensitive or private information in datasets. These include statistical methods, deep learning techniques and natural language processing for the data types above. Typically they can be summarised...
...and the public sector. Machine learning is a method for creating algorithms that enable computers to learn from data to make predictions. Examples of machine learning techniques are: reinforcement learning,...
...environment (IDE), like that of Visual Studio or IntelliJ – a tool which makes the process of writing high-quality software easier through various integrations such as offering code optimisations and...
...to bring our false negative rate down below 20%, so we estimated a good sample size for each experiment; each algorithm’s related links would need to be live for about...
Read about Natural Language Processing projects happening in data science teams across government
...links to other related pages. Then certain navigational links that facilitate browsing, like breadcrumbs and other items from the relevant topic in the taxonomy, are automatically linked to the new...
Representing text as vectors: We can represent the text on each GOV.UK page as a semantic vector. This is denoted by a list of numbers, which conveys information about the...
We're using supervised machine learning to organise all the content on GOV.UK, which means we can do things like create step by step journeys and consider voice activation. Here's what the data science team did.
One of the questions the data science team at the Government Digital Service (GDS) often gets asked is ‘how does your team work on data science projects?’ We thought we’d blog about how we have evolved our approach to data …