Sharing research data ‘a work in progress’

Journals’ predilection for data availability statements makes little difference to readers’ chances of getting their hands on the data

June 21, 2023

A push to include “data availability statements” in journal articles, to help improve the reproducibility of published studies, has done little to boost researchers’ access to each other’s underlying data.

A study by analytics company Digital Science has found that the coronavirus pandemic spurred rapid growth in the adoption of data availability statements, with their prevalence more than doubling in 2021. Nevertheless, their use remains patchy, ranging from 98 per cent of papers at one publisher to 14 per cent at another.

And their impact is even patchier. An analysis of journals in the chemical sciences found that data availability statements were present in about 93 per cent of the papers produced by AIP Publishing, 86 per cent at MDPI and 28 per cent at Springer Nature. In all three cases, only about 5 per cent of the papers featured links to online repositories containing the data, with the code hosting service GitHub among the most popular.

“Just because you require a data availability statement doesn’t mean the data are going to be more likely to be there,” Leslie McIntosh, Digital Science’s vice-president of research integrity, told an Australian webinar.

Her company has developed the researcher equivalent of services such as Turnitin, which helps to detect misconduct among students. The “research integrity app” analyses “trust markers” – data availability, ethics, funding, author contribution and conflict-of-interest statements – across tens of millions of publications.

Dr McIntosh acknowledged that there was a “compliance problem” with data disclosure. Researchers “don’t have time to ask for data”, she told Times Higher Education. “It needs to be standardised, and repositories are the way to do it. We might as well forget that the data is available if you haven’t actually shared it.”

Dr McIntosh said privacy and ethical considerations sometimes prevented data disclosure. Time was also a factor, she added, confessing that she had forgotten to upload the data from one of her own papers: “We cannot put all of the onus on researchers. It takes time to work with data and make it available.”

She said things were improving, with some researchers finding “great ways” to boost the reproducibility of their studies. “They’ll give the code. They’ll make sure that you have synthetic data so you understand how to run the code. Some people are getting very creative.”

john.ross@timeshighereducation.com
