If there’s one thing you can rely on data to do, it’s continually change. Seeing as metadata is the data used to describe data, that’s always changing as well.
It’s part of the Office for National Statistics’ (ONS) mission, working with the Data Standards Authority, to make sure that using and sharing data across government is as easy as possible.
Making metadata accessible is a big part of this.
Doing metadata well comes down to ensuring the provenance, integrity and quality of the statistics that will be produced from the data that the metadata describes. We don’t simply want to know where data has been used - we also want to know what variables have been used when the same question has been asked across several of our surveys.
Good metadata should:
- increase data literacy across an organisation
- be designed to be reused and not replicated
- be easily governable
- have traceable lineage to improve user trust
- offer better analysis, integration, and harmonisation across everything the data is meant to do
We work hard to do all that while also making metadata more accessible for people who use it, and we thought we’d share some of how we’ve been meeting that goal.
Refining our model
When we blogged last year about why metadata is important, we were testing the value of our metadata model in a data catalogue - an organised inventory of data assets across an organisation.
We used an open-source metadata tool called Mauro Data Mapper, along with public data, to demonstrate the value of the catalogue across the organisation - in this case, ONS itself. The catalogue included details of the statistical variables used in our work, such as surveys, and in producing statistical output. We also integrated and harmonised the 19 business glossaries we had found during our discovery into a single, highly polished glossary.
We then developed and tested our:
- user journeys, so we understand the metadata roles, responsibilities, and where they sit in the organisation
- metadata model, to understand the importance of different elements and where in the lifecycle they are captured and managed
- minimum set of mandatory elements that we have tactically implemented to support testing and developing the metadata collection and governance process
- single compiled business glossary (but more on this later, because it’s worth spending more time on!)
- deployment of the tool to our cloud platform so we can use real data
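To illustrate what a minimum set of mandatory elements can look like in practice, here is a minimal Python sketch of a completeness check on a metadata record. The element names and the example record are hypothetical, chosen only for illustration. They are not the set we implemented, and this is not how Mauro Data Mapper validates records.

```python
# Hypothetical mandatory elements, for illustration only.
MANDATORY_ELEMENTS = ("title", "description", "owner", "source", "last_updated")


def missing_elements(record: dict) -> list:
    """Return the mandatory elements that are absent or blank in a metadata record."""
    missing = []
    for element in MANDATORY_ELEMENTS:
        value = record.get(element)
        # Treat a missing key, None, or a whitespace-only string as not supplied.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(element)
    return missing


# A made-up record with a blank description and no last_updated value.
record = {
    "title": "Example survey variable",
    "description": "",
    "owner": "Example team",
    "source": "Example survey",
}

print(missing_elements(record))
```

A check like this could run when metadata is collected, so gaps are flagged to the person responsible before the record reaches the catalogue.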
Tools of the trade
Another way we’ve been making metadata work better for our users is by taking a closer look at the tools available across the data community, as well as the use cases that support them. We did this to find out the wider impacts of choosing a tool, and how we could focus our learning and testing to explore those impacts.
We learned quite a lot from this review, finding out that:
- governance and operating models will have a big impact on cost, and so possibly on the tools you may end up choosing - open-source tools may be free, but still have an operational cost
- automating the way metadata is gathered supports user trust, and if you make the process of data ingestion and curation demanding, you risk lower-quality data
- any metadata catalogue - or application generally - will live or die by the user’s ability to find what they need, so search and navigation are key to early user buy-in and adoption
- some tools have excellent metadata ingestion and curation, but an interface aimed at technical users, rather than the majority of people, which can mean extra training costs
What did we learn?
Overall, we learned that if you are going through a digital transformation, metadata needs to be at the centre of the work to unlock the full value of your data.
Structural and administrative metadata lineage in your catalogue and business glossary are important tools for supporting conversations and understanding the use of data across organisational domains.
We also learned not to rush.
You need to understand how you want to manage and curate your catalogue and glossary. Otherwise, you risk ending up with an expensive tool and a failed enterprise metadata initiative.
If you concentrate on usability, you can ask really good questions: does your search return too many false positives? Could a controlled business glossary improve the precision?
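To make the false-positive question concrete, precision is the share of returned results that are actually relevant. The toy Python sketch below compares a free-text search with a search on a controlled glossary term. The catalogue entries, the glossary tags and the function names are all made up for illustration, and this is not how any particular catalogue tool implements search.

```python
def precision(returned: set, relevant: set) -> float:
    """Share of returned results that are relevant (0.0 if nothing was returned)."""
    if not returned:
        return 0.0
    return len(returned & relevant) / len(returned)


# Made-up catalogue entries: id -> (free-text description, controlled glossary term).
CATALOGUE = {
    "v1": ("Gross weekly earnings of employees", "earnings"),
    "v2": ("Household income before housing costs", "income"),
    "v3": ("Income tax band of respondent", "tax"),
    "v4": ("Total household income after tax", "income"),
}


def free_text_search(query: str) -> set:
    """Return every entry whose description contains the query string."""
    return {key for key, (text, _) in CATALOGUE.items() if query.lower() in text.lower()}


def glossary_search(term: str) -> set:
    """Return only entries tagged with the controlled glossary term."""
    return {key for key, (_, tag) in CATALOGUE.items() if tag == term}


# Entries that a user looking for income measures actually wants (an assumption).
relevant = {"v2", "v4"}

free_text_precision = precision(free_text_search("income"), relevant)
glossary_precision = precision(glossary_search("income"), relevant)

print(f"free text: {free_text_precision:.2f}, glossary: {glossary_precision:.2f}")
```

In this toy data, the free-text search also returns the income tax variable, a false positive that the glossary term filters out. Real catalogues are messier, but measuring precision before and after is one way to test whether a controlled glossary is helping.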
Engaging with the organisation is essential to getting this feedback and to understanding the interoperability of your systems, which makes it as easy as possible to discover and collect your metadata. And while we’re talking about engagement, you may have to work on a broader cultural change to make sure everybody understands and supports your need to collect and manage metadata.
We’re looking forward to continuing our metadata journey as part of the Data Standards Authority, and using what we’ve learned to make data-sharing work better for everyone across government.
To talk to us about your own journey, or anything else you think we can help with, please get in touch with ONS by emailing data.architecture@ons.gov.uk.