The ambitious US plan to build a universal digital library has already produced some useful spin-offs.
We are two years into the four-year Digital Library Initiative, a $24 million research programme sponsored by three agencies of the United States government.
The National Science Foundation (NSF), Defense Advanced Research Projects Agency (Darpa), and National Aeronautics and Space Administration (Nasa) are backing an ambitious attempt to build information resources of all kinds in digital form. These may be articles in incompatible formats or satellite images that devour storage and bandwidth. They will be distributed among multiple servers. The aim is to make these resources available anywhere, any time, through interconnected networks and systems with appropriate and necessary safeguards for intellectual property and electronic commerce. The complexities of working across these multiple and often incompatible systems are to be made invisible to the user.
Six universities have been funded. Each university has created a testbed of digital materials and taken on a range of research issues, some complementary and some overlapping (see sidebar). Each has established partnerships with corporations, state agencies, public and private libraries, professional associations and publishers. Darpa has contracted with the not-for-profit Corporation for National Research Initiatives (CNRI) to play a coordinating role for the digital library research community; one of CNRI's activities is D-Lib Magazine.
The project teams (which can comprise 20 to 30 people each, counting faculty, dedicated staff, graduate students and partners) meet formally twice a year, announce findings and prototypes at their Web sites, and engage in joint experiments to test the ability of different systems to talk to each other. Interoperability, the idea that different services, systems, formats, and media can talk to each other in a way that appears seamless to the end-user, is central to building digital libraries - increasingly so, given technological development that has, at times, outpaced the standards process.
How to design the fundamental structure of a digital library to accommodate future technological change is the question driving the research at Stanford and Michigan. But the theme of traversing the seemingly incompatible shows up in all six projects, whether they are looking at multi-format combinations of text, images, and equations in engineering and physics journals like the UIUC team, or building a digital video library like Carnegie Mellon's Informedia Project.
The teams also exchange tools. For example, Informedia is conducting its studies of the use of video in middle schools within a framework articulated by the UC Berkeley team and with an online questionnaire similar to one developed by UC Santa Barbara's Alexandria Project, even though Berkeley is examining how researchers use environmental information and Alexandria is building a digital library of geographically referenced materials and services.
The DLI is not the only home for this type of research in the US. But one effect of the DLI programmatic umbrella has been to create a sense of critical mass and focus across disciplines and within the broad community of researchers. With public, private foundation and university funding, Michigan has sponsored a broad programme of digital libraries research. The DLI project, says Daniel E. Atkins, project director and dean of the new school of information, is "a major part of a strategic commitment of the university to create and use learning and scholarly communication environments to provide new competitive options for learning and research for students and faculty." The DLI project at Illinois has developed primary partnerships with 15 corporations, scholarly journals and publishers and is expected to become a standard service of the university's new Grainger Engineering Library. The service will be extended to a consortium of major universities.
Barry Leiner, who played an important role in the selection process and initial organisation of the programme while he was at Darpa, says that within the DLI projects, partnerships "between the library and computer science communities, between academia, industry, and government are important and have been quite productive in several ways, including but not limited to technology transfer." It means, for example, that librarians who have inherited a tradition of public service are cooperating with publishers who have a tradition of profit, and that some services, such as cost recovery, can be built as an "enhancement" of a common technological base.
For the researchers, the definition of digital libraries offers sufficient research scope with "real purpose" according to Stanford's project director Andreas Paepcke. "All the various camps can see benefits to the common goal that are contributed by the different disciplines such as database design, user interfaces, sociology, [and] library science," he says.
Computer science is nevertheless the first among equals. The funding comes from the computer science branches of the respective agencies, and the principal researchers either come from computer science or have "strong associations" with the discipline. They tend to "build library systems I want for myself," resulting in some interesting biases: the research "lives in English; there is some work in mathematics; [and] no work in foreign language support." But the computer scientists also bring experience of articulating abstract issues, focusing multidisciplinary research, and managing complex projects with multiple partners.
The projects have jelled quickly, and results are coming into use across different user communities, not just different research communities. In mid-August, for example, the State of California announced Lupin (Land Use Planning Information Network), one of the projects of the partnership between the state's Resources Agency and UC Berkeley. Lupin provides network access to large, remote, digitised collections of environmental information ranging from maps and satellite imagery to formal reports and documents. The system employs a map-based interface - a critical feature for land-use planners - and a document model called "multivalent documents", which allows users to look for information spatially and, once they have found an item, to "layer" information within it by annotating the data or perhaps manipulating a table. "The importance of and interest in new kinds of electronic documents has surprised us all," Berkeley's Robert Wilensky says. The multivalent document idea "integrated so many diverse aspects of our work" and created "a tremendous opportunity to create new forms of documents rather than just mimic paper with electrons." He says the DLI project has re-oriented much of the research agenda of the computer science department which he chairs, with vision researchers, for example, looking more closely at issues related to image retrieval.
Lupin is considered a good illustration of what the DLI is about: a working system that comes out of a research programme, builds on existing technology, and stimulates interest within the research community. The underlying communications technologies are, of course, based on the Internet, and are being developed globally by different communities.
The sometimes dizzying change in the underlying technologies makes for exciting times but threatens built-in obsolescence for multi-year projects, says Stuart Weibel of OCLC, a not-for-profit computer library service and research organisation that is neither a DLI grantee nor partner.
So far, the DLI projects appear to have ridden the wave. For example, developments of intranets, or local applications of Internet technologies, have had interesting implications for the Informedia project. Project director Howard Wactlar says intranet applications of digital library technologies have unexpectedly opened up a new set of possibly lucrative applications as corporations seek to capture and re-use information contained in videotaped meetings and conferences.
Informedia's research lets users search video collections of news clips or interviews with Holocaust survivors. Relevant results are returned initially in summary form with text, still images and sound clips so that users can browse the results before making selections. "We find that abstraction and summarisation are extremely important," Wactlar says. Especially for video but more generally for the Web, these tools reduce demand on the network, save users time and detect subtle differences in a collection of related material.
Users search collections differently according to the kind of question they have posed. They look for discrete information in collections of news clips but search interviews with Holocaust survivors for what Wactlar calls "the story" or the emotionalism of the narrative. What matters is "being able to extract who said what to whom" - which feeds into the type of information that corporate researchers may wish to identify.
The speed with which Web tools come along simply saves time. "When the full significance of the Web became apparent, providing the tools by which we could get things to people, we were able to accelerate the development of our digital library," says Alexandria's director Terence R. Smith. With cooperation from the UCSB library and its Map and Imagery Laboratory, he hopes to have "a truly operational digital library" online by late 1998.
And if we build it, will they come? On balance, the signs are that they will. Smith reports that his team is talking to the state of California about using Alexandria as a core for a distributed digital library system "first for the [nine-campus] University of California, and possibly with extension to the state."
Stanford has brokered an agreement among major Web search engine providers on a specification for automatic resource discovery and search result merging.
All DLI projects have begun planning how they will manage the digitised collections after the grants expire in two years. Their proposals include a broad mix of future hosts from academic libraries to commercial ventures. Partnerships of various kinds are the key, says Michigan's Atkins. It is easy, though, to over-hype these projects; like others, he cautions, "we are now building 'horseless carriages'."
Amy Friedlander is editor of D-Lib Magazine. The opinions expressed are those of the author and do not reflect views of the Corporation for National Research Initiatives or the US government.