Data should be as open as possible and as closed as necessary

This should be the new mantra for researchers in a post-Facebook world, argues Andy Turner 

April 29, 2018

Research data might not sound like a topic to get evangelical about, but people are going without help that they desperately need because we cannot make use of data that might otherwise save them. Some argue that our collective failure to save lives through data sharing is a kind of crime, and it does seem morally wrong if we cannot analyse data to help people live longer, happier lives.

Data are often not available for analysis because legally accessing and integrating data is difficult. Quite rightly, there are concerns about data misuse and many people argue that there are no completely safe settings and controls to prevent this.

People generally don’t want the authorities to have access to data about where they spend time and what they do. However, sometimes it might be necessary to know these details, for example, to identify environmental causes of disease. This is a far cry from trying to figure out how you might vote or what you are thinking of buying.

People might not want to share their lifestyle data but may be comfortable with using black box recorders that give details of how their vehicle is being driven. It is one thing to provide this detail to try to prevent accidents and make roads safer, but a completely different thing to use it to raise insurance premiums.

We digress, but the point is that people’s willingness to share data at any given moment depends on their perception of risk. The key to gaining greater acceptance of data collection and sharing is to have secure ways to ensure that data are used for the purposes we intend and for no others.

Many big social media companies and other organisations hold a lot of data about us and, to all intents and purposes, own them, but under the new General Data Protection Regulation (GDPR), we as individuals (in the European Union) will effectively regain ownership of these data. Before you rush to request that organisations delete your personal data, consider that the data might be helpful in efforts to give you and your loved ones longer and happier lives. Yes, you want your data used in a safe and secure way, but technology has advanced to a stage where this is a realistic possibility.

Beyond personal data, there is a huge amount of data about places and things that can be indirectly linked to us by coincidence. Almost all of these data are interesting from an academic research perspective, but they become research data only when they are used in research. If data underpin our understanding of how things are, how they were and how they will be, then they are fundamental building blocks of knowledge, and it is vital for science itself that a large share of them are openly available. That said, a significant proportion should not be made openly available; the challenge is to make data as open as possible (to maximise their utility and impact), but as closed as is necessary (to reduce the risks of inappropriate use).

Many existing datasets that could be of use in research are controlled by non-academic organisations, and some are commercially sensitive and proprietary. Organisations collect increasing amounts of data about us, but finding out what these data are, accessing them and then integrating them is currently not straightforward. It also needs to become easier to volunteer our own personal data to collective research efforts.

It is usually not possible to buy licences to commercial data just to see if they might be useful; perhaps they legally can’t be made available, or should not have existed in the first place, or maybe the commercial organisations just don’t want to share them – why should they, if they are not legally obliged to?

Putting these potential research data to one side, it is unfortunately still the case that much of the data collected and used in previous academic studies is not made available in a way that can be readily used in further research. Sometimes there are legitimate reasons, such as participants not having given consent or promises having been made to destroy the data, but often it stems from a lack of awareness that data should be shared. Researchers may lack the means, but sometimes the cause is laziness, or a systemic failure whereby academics believe they gain an advantage over competitors by not sharing.

The crux of the issue is that if the data are not accessible or the datasets don’t talk to one another, the knowledge available to us, as a global society, stops in its tracks – a potentially life-changing result of the ordinarily frustrating “computer says no”.

Whether the data that we can access as academic researchers are interoperable can also make or break careers but, more importantly, when they are not, the impact of our research is stunted. In my role as a data champion for Jisc, this is ultimately what I am seeking to overcome, by supporting colleagues to manage their research data effectively and helping them to share those data as openly as possible.

If the data we produce sit in silos, then we not only do ourselves a disservice but might also be guilty of failing to save lives, both now and in generations to come.

Andy Turner is a research data champion for Jisc and a researcher, software engineer and teacher specialising in computational geography at the University of Leeds.
