Researchers at the Sheffield Humanities Research Institute are building an enviable reputation for their joint investigations into classic manuscripts. Tony Durham reports.
"Xpyment" has the Microsoft Word spellchecker gobsmacked. "No suggestions," it gasps. Equally puzzled? Try an experiment (hint) and read the word aloud.
Chaucer, the printer Wynkyn de Worde and an industry of nameless medieval scribes managed very well without spell checking programs, leaving it to 18th-century lexicographers to make the world safe for Scrabble. "Xpyment" is one of three variants of "experiment" that occur in extant versions of Chaucer's Wife of Bath's Prologue, and it is one of the more restrained examples of creative spelling. When copying the same poem the scribes excelled themselves with 172 variant forms of the verb "to be". Not many people outside the University of Sheffield know that.
We should be grateful for the scribes' rumbustious way with the alphabet. It leaves traces of a text's history that are, for literary scholars, what mitochondrial DNA is for evolutionary biologists. Researchers at Sheffield's Humanities Research Institute, with colleagues at four other universities and two national libraries, are putting Chaucer's Canterbury Tales through the digital mill, electronically chopping and shuffling the aged vellum in ways that would be neither feasible nor ethical with scissors and paste.
Medieval spelling was uninhibited but not random.
"You can look at the spelling database and see that different scribes have different conventions," says the institute's director Norman Blake. Computer analysis provides important clues to the way books were produced before the printing press, though it leaves unanswered such questions as whether the scribes travelled to books they wished to copy, or ordered them on inter-library loan.
Focus on wording rather than spelling, and the digitised text yields clues to the family tree of the 80 manuscripts and four early printed editions. The extant manuscripts are usually thought to date from after Chaucer's death, but computer analysis has raised the exciting possibility that we possess a text that Chaucer himself could have seen. Cambridge University Press is publishing the digitised Chaucer on a series of CD-Roms, which include scanned images and transcribed texts with SGML tagging - an international standard which allows scholars to process the text with their own choice of software. Latin glosses are included, as are transcription notes for words where the transcription was difficult or doubtful. There is enough space on a single CD for all versions of one tale, or a single manuscript of all the tales; CDs of both kinds are being produced. Scholars who want to work on the entire database simultaneously may have to wait until it is published on the World Wide Web.
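For readers curious how wording alone can hint at a family tree, here is a minimal sketch in Python. It is not the project's method, which the article does not describe; it simply counts how often two manuscripts disagree at the same points in the text, on the assumption that copies which disagree least are likely to be close relatives. The manuscript labels and readings are invented for illustration.

```python
from itertools import combinations

# Hypothetical data: the reading each manuscript has at five points of
# variation. Labels and readings are invented, not taken from the project.
WITNESSES = {
    "MS1": ["a", "a", "b", "a", "c"],
    "MS2": ["a", "a", "b", "b", "c"],
    "MS3": ["b", "c", "a", "b", "a"],
}


def disagreement(x, y):
    """Return the fraction of aligned positions at which two witnesses differ."""
    if len(x) != len(y):
        raise ValueError("witnesses must be aligned to the same length")
    return sum(1 for p, q in zip(x, y) if p != q) / len(x)


def closest_pair(witnesses):
    """Return the pair of witness labels with the lowest disagreement."""
    return min(
        combinations(sorted(witnesses), 2),
        key=lambda pair: disagreement(witnesses[pair[0]], witnesses[pair[1]]),
    )


if __name__ == "__main__":
    for left, right in combinations(sorted(WITNESSES), 2):
        score = disagreement(WITNESSES[left], WITNESSES[right])
        print(f"{left} vs {right}: {score:.1f}")
    print("closest pair:", closest_pair(WITNESSES))
```

In this toy data MS1 and MS2 differ at only one point in five, so they come out as the closest pair. A real study would need far more points of variation and a way to tell inherited readings from coincidental agreement.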
The institute, a partnership of the university library and the faculty of arts, grew out of Mark Greengrass's project to digitise the papers of the 17th-century scientist Samuel Hartlib. The Hartlib Papers on CD-Rom, published by UMI of Ann Arbor, Michigan, is selling "slowly but surely": 30 copies of the two-CD set have gone to half a dozen countries at approximately $5,000 each. There are places where more pages get scanned and more megabytes of text are committed to CD-Rom, but Sheffield is not in competition with overseas data entry bureaux or even with higher education's million-page-a-year digitisation centre at the University of Hertfordshire. "We decided that it was not our business, storing and shovelling other people's data," explains Professor Blake. "The focus here is on high quality research rather than being very good technically for its own sake."
The institute is not possessive about projects but allows their parent departments to own them, meaning that research income, royalties and research assessment exercise returns all accrue to the department.
A research reputation opens doors worldwide. Alain Goulet of the University of Caen was keen to collaborate with David Walker, professor of French at Sheffield, when he saw how computers could help him master the complexities of Andre Gide's manuscripts and trace the author's creative process. Goulet's study of Les Caves du Vatican and Walker's work on Les Faux-monnayeurs meshed perfectly because, as the manuscripts reveal, the two novels are really a single narrative with the names of characters changed. The British Academy's humanities research board funded an institutional research fellowship, Sheffield's third, for Pascal Mercier to pursue this work.
There are some unexpected gems in Sheffield's own library. It is the home of the National Fairground Archive, which Vanessa Toulmin is using to develop MAs in popular culture and social studies of show people. The biochemist Sir Hans Krebs worked at Sheffield for more than 20 years and left the university all his papers, including a letter from Nature rejecting the paper on the tricarboxylic acid cycle which later won him a Nobel prize.
Another gem is Sir Thomas Beecham's music collection. University librarian Michael Hannon is keen to develop a website devoted to the great conductor. "Our philosophy," he says, "is to get our special collections to earn their place on the shelves in terms of supporting our research themes."
Collaboration with the British Library has flourished since 1993 when the university and the library signed a concordat, an altogether less mercenary thing than a contract. As a result of the concordat BL staff have done PhDs at Sheffield, with the university covering their fees, while the BL contributes by continuing to pay them for the time they spend on research. Meanwhile staff and students from Sheffield have worked at BL sites on the Canterbury Tales and other projects including the cataloguing of the papers of the theatre critic Kenneth Tynan and the theatrical agent Peggy Ramsay.
The Canterbury Tales are dwarfed by the five million words of anti-Catholic polemic penned by John Foxe in the reign of Elizabeth I, The Acts and Monuments of the English Martyrs. This chronicle of the persecution of Protestants under Queen Mary required phenomenal research, and Foxe is perhaps under-rated as a historian. He painstakingly recorded the Catholic arguments, adding marginal comments of "abominable blasphemy" or "monstrous heresy" to prevent any possible misunderstanding.
In 1993 the British Academy set up a project to produce a new edition of Foxe. The project, led by history professor David Loades, began at the University of Wales, Bangor but after two years moved to Sheffield where John Young and Susan Smith are undertaking the massive task of transcription.
Oxford University Press is preparing to publish digital facsimiles of the four editions published in Foxe's lifetime, on a CD-Rom set. According to Professor Blake, "the principal interest for Foxe specialists will be to see what he took out and added, and work out why".
Research fellow Rhian Davies is comparing manuscripts, proofs and published versions of novels by Benito Perez Galdos. She is interested in the background to each novel and the way it was produced. Revered in Spain but less well known internationally than his contemporaries Tolstoy, Dickens and Balzac, the 19th-century writer nevertheless has some fervent British fans. They include local boy Lord (Roy) Hattersley who came to the university in 1997 to lecture on Galdos before an audience which included the Spanish ambassador.
Galdos's novel Torquemada en la hoguera was written as a serial for the magazine La España Moderna and then adapted for publication as a book. Dr Davies finds the rapidly penned serial version "a lot more imaginative". Metaphors are toned down in the book; in one place "blackness" is replaced by "sadness". "I have never really forgiven him for that," Davies says. It was, she believes, Galdos himself and not some nameless editor who excised the metaphor. Galley proofs of the book are covered in the author's ink. "A lot of people don't realise nowadays that the finished work is something that has been worked on and revised," Davies says.
She is constructing a Galdos website and hopes to build a big database which will allow scholars to trace characters, places and ideas through the author's novels. Maps and pictures of old Madrid are already on the site.
"We want to situate it in the context of 19th-century Spain," Davies says. She intends the site to carry a mass of supporting material alongside the texts themselves.
Digital scholarship in the humanities has come a long way since the early days of textual analysis of the who-wrote-Shakespeare variety. The visual quality of scanned images is no longer the worry it was.
"Often digitisation helps you to read manuscripts that would otherwise be very difficult to read," Professor Blake says. He expects that the institute's projects will make increasing use of sound and video. There is, after all, now a century's worth of recorded sound and moving images to study - no longer a mere postscript to the 400-year supremacy of the printed word.