Data democratisation is a term that is perhaps not very well understood, but as this presentation highlights, it can be vital for research organisations to take seriously. At the beginning of his talk, Sam Jackson shared a definition of the concept from IBM: Data democratisation seeks to “make data more accessible to non-technical users, in part, by making the tools that access the data easier to use.”
Jackson stressed that, for his purposes, this definition refers not so much to bioinformaticians as to those who work in wet labs, whose expertise focuses more on tissue than on coding. Here, data democratisation is about creating an ecosystem in which to work with data more effectively. Underpinning the concept are, of course, the principles of FAIR data: data that is findable, accessible, interoperable, and reusable.
Jackson pointed out four main benefits of democratising data. First, enabling more people to work closely with data increases the generation of insights based on it. Second, it maximises return on the investment put into advanced data generation techniques. Third, data democratisation drives quality and reproducibility of data because more people are reading and scrutinising the datasets. Finally, it encourages collaboration among academics to define future needs in spatial biology data analysis.
To fully realise the potential of data democratisation, Jackson said that it helps to consider the challenges with its implementation. He highlighted three key questions: What does the optimal integration of data democratisation look like? What is the current status of the data at hand? And how do we get to where we want to be?
Jackson then discussed the importance of interoperability and file formats in spatial biology, emphasising the need for methods and tools to convert spatial transcriptomics data to a common format. He highlighted the benefits of creating an open spatial data format, which would enable data sharing, comparison, and community development of tools. Additionally, Jackson proposed the idea of an online data repository for spatial omics, supported over the long term, to facilitate visualisation and interrogation of data.
The talk concluded by considering three models for advancing data democratisation: an international large-scale model, an international purpose-specific model, and a national model. Jackson suggested convening a meeting to discuss the future of data in spatial biology, involving large funding bodies and starting the conversation about community-led initiatives.