Online archive celebrates a decade of data dissemination

August 3, 2001

The Los Alamos E-Print Archive has helped thousands of physicists and mathematicians worldwide to gain access to a wealth of research over the past ten years. Julia Hinde looks at its success.

A revolutionary e-archive that has transformed communication between physicists celebrates its tenth birthday this month with a 3,000km move and talk of expansion into new disciplines.

Every morning, thousands of theoretical physicists and mathematicians - in labs from Manchester to Moscow, Tucson to Tehran - log on to the Los Alamos E-Print Archive, at arXiv.org, to find out what is new in their field. Covering areas of physics, mathematics and computer science, the web-based archive is the brainchild of Paul Ginsparg, a high-energy physicist at Los Alamos National Laboratory, the top-secret US weapons lab. The archive receives more than 200 new submissions each day, many of them original research papers.

Each submission is indexed, stored and put onto the website. Then, whether in Delhi or Dallas, physicists, regardless of affiliation or level, have equal access to all 170,000 submitted research papers - many of which are uploaded in advance of peer review. Similarly, regardless of location, academics have equal opportunity to publish their own material on the site or to make additions to others' work. All without having to pay a journal subscription.

With an immediate turnaround of papers, minimal overheads compared with traditional paper or online journals, and none of the reams of paper associated with the hard-copy circulation of pre-print documents, arXiv.org has become a quick, cost-effective way for the physics community to share findings.

The archive, which this year expects to receive a further 35,000 papers, is moving with its founder to Cornell University in upstate New York, where there are plans to explore "the extension of this idea into other disciplines", notably biological sciences.

Such plans raise fundamental questions about the future of scientific communication and publishing.

There is little doubt about the archive's impact over the past decade in the physics community. "It is an exceptionally important medium," explains Jonathan Halliwell, reader in theoretical physics at London's Imperial College. "I don't know what we would do without it." According to Halliwell, the archive is the "universal medium of communication in theoretical physics".

Most physicists still submit to journals, since peer-reviewed publications in prestigious journals still carry weight when it comes to grant applications and securing jobs. But many submit unreviewed pre-prints to the archive at the same time as submitting to a journal, and thus get their work into the public domain as quickly as possible. As such, arXiv.org provides a function that works in parallel to the peer-review system.

Halliwell adds that the archive - which receives 2 million visits a week, more than two-thirds of which are from outside the US - is an extension of the pre-print culture long embedded in physics. Traditionally, unreviewed hard copies of research would circulate in the community long before papers appeared in peer-reviewed journals, a process that could sometimes take up to 12 months.

The archive has contributions from academics in 100 countries, with 6 per cent of submissions from the UK. It has also levelled the playing field in physics. Researchers in less-developed countries, where paper copies of journals may arrive months after publication, if at all, have the same access to research reports as researchers in more industrialised countries, while those at less prestigious institutions - or at an earlier stage in their career - have as much chance of getting their views read as those in top-ranked universities.

But there are questions about whether the archive can work in such a form in other fields. Some point to the lack of a pre-print culture in the biological sciences, while Ginsparg notes differences between disciplines and their communities. Nevertheless, he believes the current system of scientific publication displays "intrinsic instabilities" across all disciplines, and suggests "much further evolution in the coming decade".

"The question crystallised by the new communications medium," says Ginsparg, "is whether this arrangement remains the most efficient way to organise the review and certification functions, or if the dissemination and authentication systems can be naturally disentangled to create a more forward-looking research communications infrastructure."

Ginsparg, 45, who as part of his new post at Cornell will look at future directions for the archive, has his own vision for future science publishing. He suggests an electronic, multi-layered global network, with scientists putting their unreviewed raw data online at one level, while future electronic journals perform an overlay role, acting as pointers to selected entries at the scientists' data level. As such, readers wanting a structured guide to the mass of raw, unreviewed data could access the network, at some price, via the journal level where papers would be filtered and organised, while those wanting to trawl raw data could access it free of charge at the data level.

"Ultimately, issues regarding the correct configuration of electronic research infrastructure will be decided experimentally," Ginsparg says.
