Data Engineering

Using Data Science for Next-Gen Statistics

Rap sticker on a laptop

As the 21st century progresses, using data effectively has become a priority for many organisations, including the Office for National Statistics (ONS). The ONS's unique focus, however, goes beyond just using data effectively. The organisation's ultimate goal is to create …

Splink: Fast, accurate and scalable record linkage

Categories: Data Engineering, Data science, Python
Some of the graphical outputs of Splink

A common data quality problem is having multiple records that refer to the same entity but no unique identifier that ties them together. For example, customer data may have been entered multiple times by accident, or …

Engineering the data of the future

Post-it notes on a window showing a data engineering pipeline from raw data ingestion to schema-on-read by users

Like most organisations today, the Ministry of Justice (MoJ) wants to use its data more effectively. The goal is to make sure that decision makers, whether front-line prison staff or senior civil servants, have the insights they need at the right time.