High-quality and publicly accessible datasets should be given as much credit as standard publications in the research excellence framework (REF), the Royal Society has argued.
In a major report on open data, released on 21 June, the learned society says that the potential of the internet to facilitate collaboration among both professional and amateur scientists “may pave the way for a second open science revolution as great as that triggered by the creation of the first scientific journals”.
The report, Science as an Open Enterprise: Open Data for Open Science, says the routine publication of datasets in intelligible, assessable and usable formats - which it calls “intelligent openness” - would also allow a new breed of data scientists to search for unsuspected relationships, such as between disease mechanisms and the properties of drug-like compounds.
Such openness would also improve the detection of scientific error and, in areas such as climate science and genetic modification, help to build public trust in science.
“The skill and creativity required to successfully acquire data represents a high level of scientific excellence and should be rewarded as such,” the report states. However, it complains, the “prevailing culture” views data as a “private preserve” and research assessment schemes typically focus solely on research papers.
The Royal Society believes that a “powerful motivation for data release” would be for REF panels to treat open datasets “on a par” with publications.
Although the REF does not explicitly exclude datasets, only one of the four main panels for the 2014 exercise mentions them in its submissions guidance.
David Sweeney, director for research, innovation and skills at the Higher Education Funding Council for England, which administers the REF, pointed out that 132 datasets had been submitted to the 2008 research assessment exercise.
He welcomed the Royal Society report and said that the REF team would take it into account in refining the way datasets would be handled in 2014.
According to Geoffrey Boulton, Regius professor of geology emeritus at the University of Edinburgh and chair of the working group that drew up the report, datasets risked losing out to research papers in the REF because claims for the significance of papers were often supported by bibliometric data, which was favoured in the “accountancy culture we are all prey to”.
But the report suggests that datasets could also be assessed using quantitative measures of their citation and reuse.
Professor Boulton noted that where databases were made citable, they often received an “order of magnitude” more citations than the first paper based on them.
The Royal Society report also urges journals to require authors to make available - in institutional or subject repositories - datasets on which the conclusions of papers depend. Funders should also require datasets produced with their funding to be publicly accessible, and should cover the costs of “curating” them, the report argues.
It points out that the costs involved are often “demonstrably smaller than the costs of collecting further or new data”.
Meanwhile, universities should recognise data communication as “an important criterion for career progression” and should withhold data created by their researchers only “when it is optimal for realising a return on public investment”.