The internet has revolutionised humanities research. But has the development of ever-more sophisticated online resources freed up scholars to explore new ideas, or made them slaves to the digital machine? Matthew Reisz reports

December 8, 2011

It is hard to overstate the scale of the revolution that goes under the name "digital humanities". Someone somewhere is digitising every book ever written by a Mexican immigrant to the US, every reference to a theatre on the south bank of the Thames in Elizabethan England, or the complete contents of an obscure local library. Ancient inscriptions are being catalogued and deciphered, virtual theatres are taking shape, Wittgenstein's letters are being put back into context, and we can now find databases listing every 18th-century clergyman and every single person recorded in Anglo-Saxon Britain.

A remarkable project titled Cultures of Knowledge: An Intellectual Geography of the Seventeenth-Century Republic of Letters seeks to reconstruct pan-European intellectual networks by creating a modern equivalent. Many leading thinkers of the time were forced by war to flee from their homes and so left their papers all over the Continent. The digital revolution and the collapse of the Soviet Union, says Howard Hotson, professor of early modern intellectual history at the University of Oxford, have enabled a team based in Oxford (with support from the Andrew W. Mellon Foundation) to build "radically multinational forms of international collaboration of a kind which was effectively impossible before", so as to reassemble in virtual form long-scattered learned correspondences. The project should transform the study of topics such as the Scientific Revolution.

The 18th-century equivalents of anthologies of cat poetry or poems to heal a broken heart were popular miscellanies of verse and song. Given the sheer bulk of the material - around 40,000 poems in some 1,200 surviving volumes - and the fact that it reflects what ordinary people actually wanted to read, it can clearly tell us a great deal about the literary marketplace, the cultural context for canonical authors, and changing tastes and fashions. However, it has hitherto been largely neglected by scholars. The Digital Miscellanies Index is designed to remedy that.

A three-year project, the index is hosted by the Bodleian Library's Centre for the Study of the Book, funded by the Leverhulme Trust and led by Abigail Williams, Lord White fellow and tutor in English at St Peter's College, Oxford. Academic experts set the basic parameters and supply attributions for anonymous poems. Once the resource goes live in 2013, those studying 18th-century literature and social history will be able to track trends through almost instant graphical representations, enabling them to ask many big questions that would not have been possible before. They will also be able to ask more trivial questions about, for example, the changing popularity of poetry about cats.

It is unclear how far the index will be consulted by non-specialists, but outreach and "impact" are built into the project.

There are podcasts exploring the miscellanies' treatment of such varied themes as Christmas, cricket, flatulence and tobacco. The team includes a voice-and-fiddle duo called Alva, who will take part in concerts of Yorkshire music, a "Georgian evening" and an evening about 18th-century sport, all arising out of material in the miscellanies (they will also record a CD). Williams has secured a British Academy mid-career fellowship for such public-facing activities, which will also enable her to carry out the research for a book. And there has already been press interest in whether a smutty ditty attributed to Milton in one of the miscellanies was actually written by him, or was a deliberate attempt to discredit him.

Projects such as these - not to mention the Proceedings of the Old Bailey, 1674-1913, which includes records of close to 200,000 criminal trials and has generated countless scholarly "outputs" as well as the popular BBC legal drama, Garrow's Law - are just a few powerful examples of what can be achieved in the digital humanities. Yet many people have raised questions about whether there aren't just as many pointless projects and whether the field always justifies the hype that surrounds it.

The most basic concern, in the words of Katharina Lorenz, associate professor in classical studies at the University of Nottingham, is that "much money has been spent on creating unused resources".

Another researcher, who asked to remain anonymous, recalls looking at dozens of funding proposals "to digitise some existing print or manuscript source material. Whenever I raised the 'What's the point?' question, I was always told: 'Because if it's digitised you can do all these really interesting things with it.' But oddly I never seemed to see a proposal to do any interesting things, I only ever saw proposals to make more and more digital resources.

"I felt that digitising had become an end in itself, that too much of the limited research funding available in the humanities was going into it, and that consequently people whose aim was to get funding were being given a perverse incentive to limit themselves to what often seemed like a very mechanical and low-level form of research. Meanwhile, there was less money available to fund researchers wanting to investigate a substantive question or develop an original idea."

We are now witnessing what Martin Wynne, Oxford University Computing Services liaison at the Oxford e-Research Centre, describes as "a move from research leave to research grants, with academics required to hire staff and manage teams". This is obviously more congenial to some people than others, and critics argue that it is a trend driven far more by financial than scholarly goals.

But there is widespread agreement that the developing discipline and funding regime have overcome some of the teething problems. "Digital resources and infrastructure are developed to solve scholarly problems, not as ends in themselves," says Hotson, "to serve our own projects and interests on the assumption that other scholars have very similar projects." This avoids the danger of what amounts to academic "deskilling". And while some earlier initiatives by researchers may have produced obscure and sometimes self-indulgent resources that helped them but were of no use to anybody else, Wynne argues that "reusability, sustainability and visibility" are the guiding principles today.

So how should we regard some of the more grandiose claims that are made for the digital humanities? Open-access projects, we are constantly told, democratise knowledge by making it available to anyone with a computer. "Far from being geared solely to academic questions," says the website for Linguistic Geographies: The Gough Map of Great Britain, a chart that is thought to date back to the 1370s, "the project team was keen to ensure that our research findings reach the widest possible audiences, not least because maps are enduringly popular objects and always capture the imagination."

New resources are also said to enable us to interrogate data in different ways and to ask fresh questions, including some that were previously not even imaginable. Since we can never tell what the scholars of the future are going to be interested in, almost anything might turn out to be useful. And if an academic discipline is in decline, digital tools can provide a way of reviving interest.

Such arguments are almost incontrovertible in the abstract, and are amply justified in particular cases, but often seem to be accompanied by very sketchy notions of what might constitute success or failure. Is it too crude to expect a database requiring x thousand pounds of research funding to generate so many thousand hits, five monographs, three spin-off radio programmes and 20 newspaper articles? And when does it become a dubious use of public money to create ever-more-sophisticated resources for disciplines that seem to be in terminal decline?

The issue of democratisation is also intriguing. It can raise major moral and political questions in relation to what books and journals are available in the developing world. And there does seem to be evidence that even highly obscure topics generate more interest than one would ever expect. DigiPal, the Digital Resource for Palaeography, based at King's College London, focuses on the study of 11th-century handwriting. Although the website is still rudimentary, it has already had about 2,000 hits over the past three to four months.

Even more striking is the Ancient Lives site, which takes visitors straight into a tutorial with the words: "This is a piece of text discovered in Egypt, written over a thousand years ago. We'd like you to help us to read it." About 1 million fragments of papyrus were brought back from Oxyrhynchus around the beginning of the 20th century, of which only 1 per cent have so far been published. Now crowd-sourcing techniques pioneered by astrophysicists have enlisted tens of thousands of amateur volunteers in transcribing the scraps of text and recording their dimensions with "virtual tape measures". Sequences of even a few letters can then be instantly matched with the complete corpus of surviving Greek literature. If anything such as a new Gospel turns up, it is likely to be spotted very quickly.

Yet in many cases it is hard to know what democratisation means. The Online Chopin Variorum Edition, funded by the Andrew W. Mellon Foundation, is directed by John Rink, professor of musical performance studies at the University of Cambridge, with the technical development provided by King's College London. Chopin had a habit of tinkering with his music and selling different versions of his mazurkas and preludes to publishers in different countries - one piece was made available in 80 distinct forms.

Many of the modifications are so small as to be inaudible, admits Paul Vetch, senior lecturer in digital humanities at King's, but there are also cases of "changes in accidentals that would have a massive tonal impact and change the feeling of a piece of music". Although the project largely "arises out of John's personal dissatisfaction over communicating to other scholars the complexity of the publication process", there was also "a democratisation aspect in building the resource in the first place. We had to bring all the material together to allow the scholarship to be done."

This is surely "democratisation" in a pretty rarefied sense. The overwhelming majority of the human race can have little idea what a "variorum edition" is. There can't be thousands of Chopin scholars desperate for this material and, while musicians are also interested, it isn't yet technically possible to choose one's preferred variants and "turn off" the alternatives to create a performing edition. If democratisation means anything, it must surely mean more than just making freely available online something of interest to only a handful of specialists.

Andrew Prescott, who will head the department of digital humanities at King's from January, is an eloquent advocate for the field. "I've been in this game as a librarian and then an academic for over 30 years," he says, "and the transformation is just breathtaking. Now there's no need for lengthy journal searches, which required an incredible amount of shoe leather.

"Digital technologies can now be used to investigate cultural treasures [such as maps, manuscripts and literary classics] with new tools as well as going into unknown and undiscovered territory. We want to reboot and bring the traditional scholarly editing skills into a completely new environment."

Although there is little dispute about "the possibilities of the technology, which we've now got fairly sharp ideas about", Prescott acknowledges that we are "still in an era of confusion in the digital environment around the sort of business models we apply to questions of impact, depth of data and so on. We may not yet have the tools to understand how we make those decisions about the depth of data."

This is quite an admission. Someone publishing a hard-copy atlas is likely to have views about whether it needs to reach down as far as every village, every street or every house, and whether it needs to include figures for annual temperatures and rainfall, because they have some sense of who is going to use it and for what purpose. And it is hard to know how to make such decisions without a clear end-user or users in mind. Grandiose statements about "the widest possible audiences" are no substitute. Is Prescott implying that people creating resources in the digital humanities often fail to think through adequately who they are for?

An interesting example here is the Digital Image Archive of Medieval Music, now based in Oxford. This started as a digital repository in 1998, and aims to make available the entire corpus of polyphonic medieval and Renaissance music, together with the latest scholarly metadata. It has preserved images of rare and damaged manuscripts and used enhancement techniques to recover music invisible to the naked eye. Although it is obviously a tool for scholars, it has also been used by performers and by teachers wanting, for example, to illustrate a lesson about the Battle of Agincourt with an image of the Agincourt Carol.

This is an entirely good thing, of course, particularly given the decline in the study of medieval music within universities, but unfortunately this wider use wasn't predicted. In the words of a recent impact report, "the current front-end website is directed towards the 'expert' user: a scholar who already knows to some extent what to look for, and what could be found on coming to the DIAMM website. Focus groups and feedback from the actual user community - a much broader group extending beyond academia - has informed the development of a new website front end, which will not only benefit the expert user and enhance their experience of the resource but will also attract and embrace non-expert users without patronizing them."

To be more specific, the site's main classification system was sorted by the country and city where the manuscripts are now held. For the director Elizabeth Eva Leach, professor of music at Oxford, who claims her "knowledge of border changes ends around the 14th century", this was not very helpful. For an ordinary amateur music lover, it was positively useless. Fortunately, this is now being put right and a marvellous resource will be far more accessible to non-specialist end users, but it is surely a good pointer to some of the issues Prescott flags up.

So what about the scholars who use, or could use, these resources? On one level, the battle is won. It is a rare researcher in the humanities today who doesn't draw frequently on digital resources, as well as using the internet to check factual details or read texts that are long out of print. But are they getting the resources they need most or is there still a mismatch between supply and demand? And do they also have reservations about current trends?

"There is a lot of snobbery about admitting to making substantial use of these digital resources," says Joanna Bourke, professor of history at Birkbeck, University of London, "and it is wholly misplaced. [Digital] allows for building up ever more complex hypotheses. It enables researchers to quickly check the assertions of others by doing some simple searches. It will never replace physically entering an archive, and it would be to the detriment of our disciplines to imagine that it eliminates the need to travel to other places.

"But I think most serious researchers can be given credit for being able to think imaginatively about their topic, and can be assumed to possess the passion to embrace the new technologies while throwing themselves into the exuberant pleasures that come from physically touching the paper and other objects that provide clues to the very different ways of thinking in the past. That is the bliss of history."

Yet within this broadly positive picture, researchers in different disciplines raise a range of caveats. Although "new kinds of digital resources can enable new kinds of scholarship", says Mark Turner, professor of 19th- and 20th-century literature at King's College London, "that's tended not to happen too much in my own discipline so far". While he admires a major project currently in development at the University of Wisconsin to analyse serialisation techniques in the novels of Charles Dickens and George Eliot, he sees "other innovative projects" as "fairly limited in focus".

Turner also has a more specific worry about "a false sense of security in some digital projects" that arises from an initiative he was involved in "to digitise a series of 19th-century materials (around 100,000 pages of newsprint). Among the most interesting things that came out of the project was the limit on what digital technologies can do with 'older' materials. Because these can be difficult to read for optical character recognition machines, levels of accuracy cannot be guaranteed and much is misread in the process.

"One result is that, in the finished resource, you may think you are doing a complete search for a keyword such as 'Dickens', but in reality much is likely being missed because of the limitations imposed by the mechanical process. And of course, it's impossible to check by hand all those thousands/millions of new digital pages."

Tamson Pietsch, lecturer in imperial and colonial history at Brunel University, works on "the story of imperial academic connections in the first part of the 20th century", a topic she believes has been neglected precisely because "the material I work on (which ranges from personal papers to institutional records) is held in university archives across the world" that she has had to travel far to access.

While Pietsch acknowledges that "digitisation does reduce some of the obstacles imposed by distance", for example in the case of "the University of Sydney's digitisation of all its university calendars (and some wonderful photographs) and the National Library of Australia's Trove index, which collates pretty much everything online about Australian history", she stresses that "the bulk of the material I need remains available only in bricks-and-mortar institutions, and I can't see that changing".

The danger for Pietsch is that "what gets digitised drives scholarship. I'm sure since its release there have been a disproportionate number of articles using material from The Times' digital archive." It is also safe to assume that poorer countries, like poorer universities, are going to be slower to digitise their archives, with all the potential for distortion that introduces.

Gill Perry, professor of art history at The Open University, runs the Open Arts Archive with 17 collaborating museums and has seen her own research transformed by digital resources. However, she also points to a number of problems.

Copyright issues often mean that there are no digital records of exhibitions. Students start to believe that an image viewed online can be a genuine substitute for seeing art in the flesh. Digitisation also creates an implicit hierarchy, with photographs reproduced better than paintings, and paintings better than sculpture, while the sorts of installation that viewers are expected to walk around and through simply cannot be experienced online. Only the work of digital artists is available exactly as intended. Everybody knows that violence and nudity feel very different in the cinema and on stage, and there is something of the same difference between real and virtual artworks.

"We were all very seduced by digital a few years ago," says Perry, "but it shouldn't be a substitute for more creative, interpretative work."

