The Cambridge Encyclopedia of the World's Ancient Languages
June 18, 2004

This encyclopaedia describes some 50 languages and dialects of antiquity, their writing systems, phonology, morphology, syntax and cultural and linguistic context. The languages range from Sumerian to Epi-Olmec (better known as Isthmian) and include many whose very names most readers will not have heard, such as Urartian, Eblaite and Palaic. But the more familiar languages are, of course, treated too: Hebrew, Greek, Latin, Sanskrit, Aramaic, Phoenician and so on. The introductory chapter gives an overview of undeciphered languages such as Proto-Elamite and that of the Indus Valley seals, and of insufficiently attested languages such as Raetic, once spoken in the Alps, and Lemnian, known from the Aegean island of Lemnos, of which just enough remains to allow experts to squabble about their filiation: Indo-European or Etruscan?

We are warned in the introduction that this work has been "designed primarily for linguistic professionals and students". This is a fair warning: the chapter on Sumerian, the first language treated, will make for exceptionally difficult reading (one might almost speak of decipherment) - yet, as a rule, professionals and students of linguistics will seldom find themselves at much of an advantage over a curiosity-driven layman.

The reason is twofold. First, each chapter has been written independently, in its author's own terminology. Not only must one adapt to a different set of technical terms with each new language, but there is no glossary, so that, all too often, one is reduced to guessing. Even notational conventions vary from author to author and are often used without explanation or definition, or explained only several pages after they are first introduced. Second, there are typesetting errors, a clear example of which is the doubling of several paragraph titles in the chapter on Ancient Chinese. As a consequence, one is never absolutely sure whether some puzzling example, some strange construction, some incomprehensible point of grammar, is due to a lack of proper explanation or merely to a printing mistake.

Consulting the encyclopaedia thus becomes a major effort. One is forced to keep notes and to turn to them almost constantly.

Space does not allow for the discussion of even a tiny sample of the languages in the book in any detail. So let us turn to Sumerian, given its antiquity and importance, and see what a layman or a student of linguistics stands to make of the chapter.

The oldest Sumerian texts, writes Piotr Michalowski, the author, date from about 3200BC. Although many thousands of clay tablets have been recovered, little appears to be known for sure, even to the point, writes Michalowski, that one text "has been considered by some to be a narrative literary composition; others think it is a word list". By about 1800BC, Sumerian had become a dead language but was still used in schooling - much like Latin in our times.

You stumble across your first puzzle in the introduction: "One of the characteristic peculiarities of Early Dynastic literature is the existence of a separate manner of writing that has been termed UD.GAL.NUN (UGN), from a sequence of graphemes commonly found in these texts... UD corresponds to the classifier dingir 'god, divine name', GAL to en, and NUN to lil₂. These three signs therefore spell out the name of the chief god of Sumer, Enlil or Ellil, written as ᵈen-lil₂." What is a classifier? Why those capital letters, why those superscripts and subscripts? The necessary explanations arrive only five pages later, with the description of the writing system.

Until then, layman and linguist alike are left to their own devices to figure out what the notation may precisely mean. Unless, of course, they have already dabbled in Akkadian and are therefore somewhat familiar with the Assyriologists' terminology and notational conventions. But those explanations will avail them nothing when they move on to the chapter on Ancient Egyptian, for its writing system is described afresh, using a different, idiosyncratic terminology, explained in particularly obscure terms.

Puzzlement turns to uncertainty with this sequence of syllabic signs in an example: du-ga-na-ab-ze-en. Remembering the rules of the writing system, one expects that to be, phonetically, something like duganabzen. But it is given as dugababenzen. That does not seem to make sense. It can only be a typographical error. But is it really?

Perhaps not: another example, of a surprising point of Sumerian grammar, reads, phonetically, sa'a dumu lugal-ak-ak, literally "cat son king-of-of" (that is, "the cat of the son of the king" - some languages of New Guinea behave like this too). But the corresponding signs read sa-a dumu lugal-la-ka. Another typographical mistake? No. Three pages earlier, under the subhead "Apocope", a brief paragraph (three lines) gives the solution: final consonants were often dropped in writing, probably also in pronunciation, so lugal-la-ka must have been pronounced "lugal-ak-a(k)", and all is well after all.

With this other example, however, a proverb, dur gudi na-b-ta-sasa ("you should not buy a braying ass"), I remain incapable of understanding the use of this ablative (-ta-) where I expected an absolutive. Wading through seas of jargon and uncertainty in search of the verbal prefix na- brought only a nagging suspicion that another possible translation might well be "do buy braying asses".

Not only is the chapter on Sumerian often obscure to the point of unintelligibility, it is occasionally misleading, as in this explanation of how the imperative is formed: "Copying the root to the front of the verbal form, which is always the perfect singular root, creates imperatives." In fact, the single example given shows that the verbal root, which normally is the last morpheme of the verbal complex, becomes its initial morpheme in the imperative: it is not copied, it is moved.

So, unless you are content with just skimming the Sumerian chapter - in which case you will emerge in a state of utter confusion - expect it to make for painful reading. If, however, you do manage to wade your way through, your unsatisfied curiosity may well prod you into learning more about the language (why is John Hayes' grammar of Sumerian missing from the bibliography?).

If those 40 pages on Sumerian require such careful, critical reading, always with an eye out for inconsistencies, and even for plain errors such as the formation of the imperative, what can be trusted in the chapters on the other languages, and who but specialists in each and every subject can profit from them? Only an Areopagus of experts could tell.

In particular, what to make of the last language treated, Epi-Olmec? This was reconstructed by the authors Terrence Kaufman and John Justeson by comparing modern Central American (Mije-Sokean) languages. It is no more attested in writing than Proto-Indo-European is. What is attested is inscriptions in an unidentified language, which has been called Isthmian, from its geographical location around the Isthmus of Tehuantepec in Mexico.

Isthmian writing looks strikingly similar to Mayan until you try to read it, at which point it makes no sense whatsoever, much like the traditional Vietnamese characters, which are the spitting image of Chinese characters but make no sense when read as Chinese.

Isthmian is represented on two inscriptions, one about 550 signs long (the La Mojarra stela), the other some 90 signs long (the Tuxtla statuette).

That is at most 700 signs in all, not even three times the length of the solitary Phaistos disc of Crete (241 signs), dated to about 1700BC, which has remained undeciphered precisely because it is too short to allow for a decipherment. Rather, such short texts allow for multiple contradictory decipherments, none demonstrably more valid than the others.
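The arithmetic behind that comparison is easy to check. The short sketch below simply adds up the approximate sign counts quoted above and compares them with the Phaistos disc; the variable names are illustrative, and the figures are the rounded ones given in this review, not exact counts.

```python
# Approximate sign counts as quoted in the review (rounded figures, not exact tallies).
la_mojarra = 550   # La Mojarra stela, "about 550 signs"
tuxtla = 90        # Tuxtla statuette, "some 90 signs"
phaistos = 241     # Phaistos disc

isthmian_total = la_mojarra + tuxtla   # combined Isthmian corpus
upper_bound = 700                      # the generous ceiling used in the text

# How many Phaistos discs' worth of text is that?
ratio_total = isthmian_total / phaistos
ratio_bound = upper_bound / phaistos

print(f"Isthmian corpus: about {isthmian_total} signs")
print(f"Ratio to Phaistos disc: {ratio_total:.2f} (ceiling: {ratio_bound:.2f})")
```

Even on the generous ceiling of 700 signs, the ratio stays below three.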

This chapter on Epi-Olmec kept reminding me of Jean Faucounau's Le Déchiffrement du Disque de Phaïstos. Faucounau reconstructs an archaic dialect of Greek, Proto-Ionian, armed with which he deciphers the Phaistos disc. He then uses this decipherment to justify his reconstruction. Kaufman and Justeson's approach is no different. They even resort to glottochronology, long discredited among comparative linguists, to prop up their dating of Epi-Olmec. Knowing the time and effort that it took to crack the Mayan code, and remembering how little is still understood of Sumerian despite the thousands of tablets found, one needs more to believe the Kaufman/Justeson claim that "(the) script was deciphered by the authors in joint work, conducted largely from 1991 to 1994" than the assurance that "the essentials of the decipherment as it relates to Mije-Sokean linguistic structure are accepted by the leading authorities who do know the evidence (Grube, Kelley, Lounsbury, Mathews, Schele, Urcid)".

As for those who remain sceptical, such as the Mayanists Stephen Houston and Michael Coe, they are mentioned only anonymously: "Some professional epigraphers who do not know our evidence have expressed doubt about the reliability of the decipherment." Kaufman and Justeson may turn out to be right some day - if many more Isthmian inscriptions are found, or a bilingual text. But then again, so may Faucounau with the Phaistos disc, or any among the scores of hopeful decipherers who have tackled it. Such a speculative article is out of place in the body of the encyclopaedia; Isthmian belongs in the introduction, in the section on undeciphered languages.

Is this to imply that the encyclopaedia is worthless? Certainly not. But it is to say that nothing in it can be fully trusted - which is all the more regrettable since the book contains much food for thought.

Reading it, one comes to realise the importance of comparative linguistics, synchronic and diachronic, in reconstructing and understanding ancient languages. At the same time, one is struck by the number of isolates, languages unrelated to any other, extinct or modern: Sumerian, Elamite, Proto-Elamite (unrelated to Elamite despite its name), Hurrian, Etruscan and so on. If there are so many isolates, is there not a distinct possibility that some still undeciphered ancient writings might also encode languages unrelated to any known ones, and thus be ultimately unknowable?

One is also astonished at how inept many ancient writing systems were at representing the spoken language. Just think of Linear B, used to write an archaic Greek, which spelt anthropos as "a-to-ro-po" and spermon as "pe-mo". Or think of the earliest Sumerian texts, which, it seems, left out all the grammatical inflections - or do they represent another, earlier, unknown language?

This book prompts more questions than it answers, another cause for irritation. Take, for instance, the statement that the Chinese word for tiger "might have been borrowed from an Austronesian language in prehistoric times". Which Austronesian language, what gloss, which prehistoric times: Chinese, Javanese or other? Would it be too much to expect to find that information, rather than be referred to "Norman 1988: 17-20"?

All in all, this encyclopaedia is a deeply interesting, even fascinating, work, alas marred by many uncertainties, many idiosyncrasies and many headaches for those who try to read it carefully.

Jacques B. M. Guy is a computer scientist interested in natural language understanding. He holds a PhD in linguistics from the Australian National University.

