Virtually brought to book

Google and the Myth of Universal Knowledge

October 6, 2006

Until very recently the world, according to the world wide web, began in 1996. While constructing even a highly specific search query may have rewarded you with tens of thousands of potentially relevant web pages, the overwhelming majority would have comprised texts generated during the ten years of the web's mainstream existence.

Now the same people who were responsible for designing the world's favourite search engine are leading the way in recognising that what the web needs next is a critical mass of reliable, authoritative, fully searchable content covering those parts of human knowledge recorded before Bill Clinton's second term as US president.

There were, of course, search engines before Google. The most successful, such as AltaVista, were perfectly adept at trawling the web and retrieving the occurrences of selected keywords. What they were much less successful in doing was presenting search results in a prioritised order. It was a great coup when Larry Page - who began building Google in 1996 with Sergey Brin when both were graduate students at Stanford University - devised the PageRank system that manages the random superfluity of search results by arranging them in a hierarchy, making sure that the most important are displayed at the top of the first screen.

Google then turned its attention to improving the web's content. Page and Brin knew that it was not necessary to reinvent the wheel: there were people and organisations that for some time had been selecting, storing and organising text-based information sources.

In 2002, Google began talking to university librarians. Initially the company did not venture far: its first thought was the University of Michigan, Page's alma mater. But by the time it was ready to go public, at a press conference in December 2004, Google had negotiated partnerships with five major libraries for an unprecedented retrospective digitisation programme of all their holdings.

With its Google Print for Libraries project (inevitably shortened to Google Library), Google would sponsor the scanning of the entire 7 million volumes of the Michigan collection and more than 1 million 19th-century volumes from the Bodleian Library at Oxford University, plus some 40,000 items from the Widener Library at Harvard University, 12,000 from the New York Public Library and an unspecified number from Stanford.

In this virtual library, books out of copyright would be available in a full-text, fully searchable and downloadable form. Material in copyright would be fully scanned and thus fully searchable, but for the time being, only "snippets" of the text would be displayed, alongside details of the libraries or shops where the complete text could be found. Google would then make the contents of its library freely available to anyone anywhere with access to a computer and an internet connection.

Such a project, aimed with such ambition at enhancing the most liberal access to human knowledge, could surely only be perceived as being to the universal good? Well, actually, no, it would seem.

The project immediately stirred considerable controversy in the library and publishing worlds. Some qualms could be countered by corporate repositioning: the audacious universal "Google Library" is now the less threatening "Google Book Search", which aims not to render libraries obsolete or publishers bankrupt but "to create a comprehensive, searchable card catalogue of all books in all languages that helps users discover new books and publishers discover new readers".

The most cogent critic, though, remains unsilenced. Jean-Noël Jeanneney, the president of the Bibliothèque Nationale de France since 2002, admits to receiving the news of Google's 2004 press conference as a healthy jolt. Within three months, he had written and published a counterattack, Quand Google Défie l'Europe: Plaidoyer pour un sursaut, of which this University of Chicago Press version is both a translation and an update.

His concerns are less to do with copyright infringement than with global cultural domination. Google's unsystematic digitisation from collections held by exclusively anglophone organisations can only exacerbate the bias that already unfairly favours the Anglo-Saxon world-view. Even if books in languages other than English are eventually scanned in representative numbers, Google's algorithm for ranking its search results, based as it is on existing popularity, will always favour the strong, keeping English texts at the top and all others invisible on the pages to which no one has the time to scroll. And what if Google's control of the world's online literature should ever fall into the hands of a less benign group than the current northern Californian liberals?

Jeanneney's pamphlet is a little too unfocused for a truly effective polemic or call to arms, but the questions he raises are vital for those concerned with the production, accumulation and dissemination of the world's knowledge, and it is good to have them so passionately revitalised for the digital future.

Christopher Phipps is development librarian, The London Library.

Google and the Myth of Universal Knowledge: A View from Europe

Author - Jean-Noël Jeanneney
Publisher - University of Chicago Press
Pages - 96
Price - £11.50
ISBN - 0 226 39577 4
