Scientific Data will publish “data descriptors”: citeable descriptions of the contents of datasets. The descriptors will contain structured information created in-house by NPG.
However, the platform will not host the datasets themselves. These must be made available via other public databases: ideally ones which are “recognised” within their research communities.
The platform, which will open for submissions in the autumn and launch in spring 2014, will focus initially on the life and environmental sciences, before expanding into other areas of the natural sciences.
It will not be a condition of publication that any papers referring to the datasets be published in NPG journals. A spokeswoman for NPG said it would be willing to coordinate the publication of data descriptors with the publication of related papers to avoid “compromising” the latter.
Jason Wilde, NPG’s business development director, said: “Over recent years researchers, funders and learned societies alike have been calling for new ways to make scientific research, and research data, more available, reusable and reproducible.”
The descriptors will be published under a Creative Commons licence. However, some academics have criticised NPG’s decision to charge a higher fee for the most permissive licence, CC-BY, which allows all reuse provided the original author is credited.
Publication under a CC-BY licence – which is also required by Research Councils UK for papers published via the gold open-access route – will cost £650 after December 2014, compared with £585 for licences that prohibit commercial reuse.
An NPG spokeswoman said the price differential “fairly represents NPG’s loss of exclusive commercial rights. By authors granting these rights to NPG, it will enable us to maximise future commercial opportunities, such as commercial reprints.”
She added that “we felt it right that these benefits should be returned to our authors, and that was the basis for reducing” the price of the non-commercial licence.
She pointed out that the datasets themselves were available under the terms of the hosting repository; these terms “tend to be very open”. Metadata associated with the data descriptors would also be available under the least restrictive Creative Commons licence, known as CC0.