The Data Science in Transport community just got bigger

Categories: Data science, Events, Python, R

Wide-angle shot of around 100 people watching a presentation

I am hoping that I will remember 2020 for plenty of good reasons, but one memory that has already made the cut is co-hosting and co-organising the Data Science in Transport event on January 23rd.

If you haven’t heard of the event before, you can be forgiven: this is the first time we, the Analytics and Data Division at the Department for Transport (DfT), have opened up our long-running event to those outside the public sector. The aim of the event is to bring together data scientists, data analysts, and data wizards from across the transport sector to learn from each other by sharing our successes and pains.

To celebrate our extended audience we threw our biggest event to date, hosted at the Microsoft Reactor in Shoreditch, London. Our participants spent their morning taking part in a mini hack. Using their programming skills they were able to show the variability of mobile phone signals along the nation’s railway lines. This was the first hack we’ve hosted, and it was a trial for future hack events to see if this method of tackling data challenges would work for us.

In the afternoon the event continued into a conference where data folk from across public, private and academic sectors presented their work to the community. There is a full list of speakers and topics at the bottom if you are curious about who we heard from.

Groups of people sat on desks coding on laptops

The value of new data that lets you see which train routes have good mobile signal strength

In our mini hack, Alex White from Transport for London (TfL) used new mobile signal strength data from Ofcom, which maps signal along the rail network using data from Network Rail’s yellow engineering trains. In just a couple of hours, Alex successfully showed us how rail coverage varies by mobile phone network across the railways.
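For readers who want to try something similar, here is a minimal sketch of how signal readings could be summarised per network. It is not Alex’s code and does not use the Ofcom data: the readings, the operator names, the `summarise_by_operator` function and the -110 dBm "weak signal" cut-off are all made-up assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Made-up readings for illustration only: (operator, signal strength in dBm).
# Real data would also carry a location along the rail network.
READINGS = [
    ("Operator A", -85.0),
    ("Operator A", -92.0),
    ("Operator A", -118.0),
    ("Operator A", -101.0),
    ("Operator B", -96.0),
    ("Operator B", -99.0),
    ("Operator B", -102.0),
    ("Operator B", -97.0),
]

# Assumed cut-off below which a reading counts as "weak" (not an Ofcom figure).
WEAK_THRESHOLD_DBM = -110.0


def summarise_by_operator(readings, weak_threshold=WEAK_THRESHOLD_DBM):
    """Return mean, spread and share of weak readings for each operator."""
    grouped = defaultdict(list)
    for operator, dbm in readings:
        grouped[operator].append(dbm)

    summary = {}
    for operator, values in grouped.items():
        summary[operator] = {
            "mean_dbm": mean(values),
            # Population standard deviation as a simple measure of variability.
            "spread_dbm": pstdev(values),
            "weak_share": sum(v < weak_threshold for v in values) / len(values),
        }
    return summary


if __name__ == "__main__":
    for operator, stats in sorted(summarise_by_operator(READINGS).items()):
        print(operator, stats)
```

Two networks can have a similar average signal while one of them drops out far more often, which is why the sketch reports the spread and the share of weak readings alongside the mean.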

In conversations with people on different mobile phone networks, I had never twigged that the degree of signal loss I experience is not universal. In the future I will also make use of the on-train wifi to improve my journey.

Be kind to your fellow man – be open source

During the hack most of us used R and Python to try and crack our problem using geographical analysis. To do this we had to use lots of open source packages that I often take for granted.
We were lucky enough to have Dr Robin Lovelace along. Robin is a contributor to the R spatial analysis package sf and developed the transport planning package stplanr. He shared that starting your contribution to open source doesn’t need to be intimidating: find a spelling mistake in documentation, or note what bothers you when you use a package and then add your ideas as a bug/issue request on GitHub. I had never realised before how little it takes to get involved.

Predicting which potholes are en route to deterioration is noisy

Frank Kelly from HAL24K spoke about his mission to map road imperfections and predict which ones are set to deteriorate. HAL24K do this using data from survey vans with sensors that drive around the Isle of Wight. In theory this is great news for all of us, but in practice it has involved overcoming many unexpected data challenges.
He showed us a wide range of data science techniques to get around these difficulties. Solutions included finding undocumented road repairs with an intervention changepoint detector using Bayesian inference (Python package: pydata-bayes-changepoint) and aligning GPS data that contains drift by applying a Hidden Markov Model (Python package: Leuven.MapMatching).
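To give a flavour of the changepoint idea, here is a minimal sketch in plain Python. It is not HAL24K’s detector and does not use the packages named above: it is a simplified, offline Bayesian model that assumes exactly one change in the mean of a series, Gaussian noise with a known standard deviation, and a broad normal prior on each segment’s mean. The roughness values, function names and default parameters are all made up for illustration.

```python
import math

# Synthetic road roughness readings over successive surveys (made-up values).
# The drop after the fifth reading stands in for an undocumented repair.
ROUGHNESS = [3.1, 2.9, 3.0, 3.2, 2.8, 0.1, 0.2, 0.0, 0.15, 0.1]


def log_marginal(segment, noise_sd, prior_mean, prior_sd):
    """Log marginal likelihood of a segment with its unknown mean integrated out.

    Model: x_i = mu + noise, mu ~ Normal(prior_mean, prior_sd^2),
    noise ~ Normal(0, noise_sd^2).
    """
    n = len(segment)
    s2 = noise_sd ** 2
    t2 = prior_sd ** 2
    deviations = [x - prior_mean for x in segment]
    sum_d = sum(deviations)
    sum_d2 = sum(d * d for d in deviations)
    quad = sum_d2 - t2 * sum_d ** 2 / (s2 + n * t2)
    return (
        -0.5 * n * math.log(2 * math.pi * s2)
        - 0.5 * math.log(1 + n * t2 / s2)
        - 0.5 * quad / s2
    )


def changepoint_posterior(series, noise_sd=0.5, prior_mean=0.0, prior_sd=10.0):
    """Posterior over k, where the change happens between series[k-1] and series[k].

    Assumes a uniform prior over the n - 1 possible change locations.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    log_post = [
        log_marginal(series[:k], noise_sd, prior_mean, prior_sd)
        + log_marginal(series[k:], noise_sd, prior_mean, prior_sd)
        for k in range(1, n)
    ]
    # Subtract the peak before exponentiating to avoid numerical underflow.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {k: w / total for k, w in zip(range(1, n), weights)}


if __name__ == "__main__":
    posterior = changepoint_posterior(ROUGHNESS)
    most_likely = max(posterior, key=posterior.get)
    print("Most likely change before reading index:", most_likely)
```

The output is a probability for each possible change location rather than a single yes/no answer, which is useful when the underlying sensor data is noisy. Real road data would need more than this sketch offers, such as multiple changepoints and an unknown noise level.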

Crafting and data science are more alike than you think

Lizzie Baggott, our Head of Data Science at DfT, gave a keynote where she did the highly improbable. She demonstrated the process of data manipulation (cleaning, joining and analysing data) via the medium of craft – joining paper datasets together with sticky tack and doing some clever folding and cutting to reveal the hidden links.
The demonstration illustrated Lizzie’s vision of bringing more sectors (and their perspectives) into the ‘Data Science in Transport’ community. The crafting needed to be seen to be believed but was a brilliant way to combine some of her interests with data science.

A room of people watch a presentation. The two are demonstrating data manipulation by holding up a large paper ring

It was a full and fun day, and my thanks go to all the speakers/attendees/co-organisers for making the day what it was. I hope that those who attended have made new connections to other transport data enthusiasts.

Our team intends to have a bigger hackathon event in 2020 and would love to have an even more diverse cast of contributors and attendees. If you want to come along next time or have something to present, then email us at to join the mailing list.


Speakers and topics

Lizzie Baggott, Department for Transport, Keynote
Dharmender Tathgur, Department for Transport, “Aviation noise and machine learning”
Dr Robin Lovelace, University of Leeds, “Transport Data Science: from regional to street levels”
John Spanton, Dr Steven Keen and Margaux Dumon, Valtech, “Delivering radical change in transport with Data Science/ML”
Dr John Carney and Dominic Duxbury, PDFTA, “In-Journey Route Optimisation”
Ian Gordon, Highways England, “Graph Databases to Map Together HE Data”
Associate Prof. Theo Damoulas, University of Warwick, “Pollution Forecasts in London”
Dr Myriam Neaimeh, The Turing Institute, “Using Data Science to Modernise Transport and Electricity Infrastructure”
James Lambert, Department for Transport, “Modelling Traditional and Autonomous Modes Split using R and RShiny”
Frank Kelly, HAL24K, “Predicting Road Degradation”
Guy Bewsher, Ordnance Survey, “Managing the Need for 3D Transport Data”
Dr Craig Smith, Agilysis, “Road Danger Reduction Methodology”
