https://dataingovernment.blog.gov.uk/2018/10/08/data-science-is-a-team-sport/

Data science is a team sport

Categories: Data science, Machine learning, People and Skills
The GOV.UK Publishing Workflow team with an embedded data scientist

One of the questions the data science team at the Government Digital Service (GDS) often gets asked is ‘how does your team work on data science projects?’ We thought we’d blog about how we have evolved our approach to data science projects and what we’ve learned during that time.

The data science team started at GDS more than 4 years ago as a handful of data scientists and policy advisors. Our aim was to practically demonstrate to senior stakeholders across government the opportunity for data science and support departments’ strategic ambitions.

Our initial work focused on rapid prototyping to show what was possible to inform policy issues. Although these outputs were statistically robust and helped shape strategic thinking, they rarely progressed to a scalable product or service.

A lot has changed since then. The ONS Data Science Campus has joined us and the Government Office for Science in supporting capability development through the Government Data Science Partnership.

We’ve grown our data science community to more than 1,200 people across the public sector and the Data Science Accelerator has seen some 100 analysts graduate - with many going on to become data scientists in their organisations.

Within GDS our range of platforms, products and services has developed to the point where machine learning can be applied at scale and shape product roadmaps.

Along the way we’ve learned from others in industry and government, and from our own experiences, about what it takes to apply data science. We’re now seeing parts of government succeed with data science in large part by building an appropriate team.

As a result of our experiences we have 3 important lessons:

  1. A multidisciplinary team is essential.
  2. Data science is not the product.
  3. One product or service at a time.

A multidisciplinary team is essential

A data science project can start in many places. It could be a query from a minister, an urgent operational need or just an opportunity someone spots for using an innovative approach. Often, meeting the user need requires a new product or service to be built.

This new product or service usually begins with a discovery phase with user research, an exploration of data, potential techniques and proofs of concept. However, the next step of building an alpha prototype - and potentially moving on to a scaled beta product - is where development can stall due to a lack of the right skills or capacity in the delivery team.

In the worst case, the team may attempt to make the leap from early prototype to a live product. This is nearly guaranteed to fail through design flaws, single points of failure and hero coding.

The need for a multidisciplinary team is so crucial it is assessed in detail in the Digital Service Standard. Having a diverse range of specialist skills and experiences has been a noted factor in high-performing teams and even teams of robot workers given construction tasks.

So if you have ambitions to scale machine learning and broader artificial intelligence (AI) applications in your organisation then ignoring this principle has consequences.

A team of data scientists is not a data science team. When a team only consists of data scientist (or closely related) roles, the team will adapt to cover roles like delivery manager, user researcher or interaction designer. This team model does not make the most of the data scientist’s technical skills in machine learning and AI, and can lead to ‘data scientist’ roles being filled by individuals who do not meet the actual role requirements. This undermines the value data scientists bring to an organisation.

At GDS we now embed data scientists in existing product and service teams. This ensures the data scientist can focus on using their specialist knowledge and make use of functional overlaps of roles like software developer, data engineer and performance analyst. This increases the pace of delivery and means learning occurs where there are skills and knowledge overlaps between specialists.

Data science is not the product

One of our strategic aims as a team at GDS has been to reduce our efforts on ‘pretty’ data science that uses single page applications for data visualisations and dashboards. These have value in communicating the data story but should not dominate a data scientist’s time.

Instead, we try to focus on applying machine learning to existing products and services that have matured to the point where machine learning and AI can usefully be applied. This follows the data science hierarchy of needs.

Current AI technology marketing can mean the output of data scientists is seen as the product - but this is wrong.

Machine-learning-as-a-service commercial offerings pitch AI as the product, but for digital services in government the work that data scientists do is only part of the product. There is a much wider set of roles and thinking that needs to happen to make a product that meets the user need.

One product or service at a time

There is a wealth of evidence that focused activity is a hallmark of successful teams, so avoiding partial allocation of specialist time is preferable.

At GDS we’re trying to build more resilience into the work of our data scientists. In the first instance we are trying to ensure data scientists can focus on a single piece of work as part of a much bigger team.

As a data science function, we’re using retrospectives and team health checks to make sure data scientists can share issues and identify collective opportunities to improve how we work.

We’re also trialling team designs that embed data scientists in pairs where possible. By pairing up and following the ‘two is one and one is none’ pattern we can apply data scientists with a mix of technical skills and experience. This supports team resilience and knowledge sharing that goes beyond the functional practices like version control and team stand-ups.

Following these 3 lessons when tackling data science opportunities requires leadership above the team who can recognise when and how data scientists can add value to products and services. Being mindful of technology hype cycles and avoiding doing 30% too much means we can realise the opportunity for data science to be pervasive and transformational.

If you’re interested in learning more about how to make the best use of multidisciplinary teams and agile delivery, you can attend the ‘Hands on agile for leaders’ or the ‘Agile for teams’ course run by the GDS Academy.

3 comments

  1. Comment by Nithya V

    Agree with the need for a multidisciplinary team in data science projects. At my organization, we follow a similar approach. Members of our Data Science teams take up pair programming and each of us leverages our individual strengths in Python and R to build up the model.

    https://honingds.com

  2. Comment by Liz

    Amazing seeing you work over the summer.

  3. Comment by Bill

    Thanks. Great blog that’s clearly grounded in the pain of real life project challenges.

    Cassie Kozyrkov, the Chief Decision Scientist at Google Cloud, shares many of the same lessons in this excellent DataCamp Data Framed podcast: https://www.datacamp.com/community/blog/decision-intelligence-data-science