Python

Using Data Science for Next-Gen Statistics

Rap sticker on a laptop

As the 21st century progresses, using data effectively has become a priority for many organisations, including the Office for National Statistics (ONS). The ONS's unique focus, however, goes beyond just utilising data effectively. The organisation's ultimate goal is to create …

Splink: Fast, accurate and scalable record linkage

Categories: Data Engineering, Data science, Python
Some of the graphical outputs of Splink

A common data quality problem is to have multiple different records that refer to the same entity but no unique identifier that ties these records together. For example, customer data may have been entered multiple times by accident, or …

The Data Science in Transport community just got bigger

Categories: Data science, Events, Python, R
A room of people watch a presentation. The two are demonstrating data manipulation by holding up a large paper ring

The 23rd of January 2020 marked the biggest Data Science in Transport community event to date. People from across academia, industry, and the public sector came together for a hack, conference, and networking event to learn from each other.

Using XPath and Python with the Google Analytics reporting API to report on a large data set

Categories: Data science, Google Analytics, Python
highwaycode, xpath and python script

A year and a half ago, two GDS Designers asked me, “Can you show us how Highway Code content on the GOV.UK site is performing?” This would have been a simple request, were it not for the sheer number of pages …