Knight of the long knives

Empirical Linguistics

February 15, 2002

With the country captivated by Harry Potter and Lord of the Rings, it is time for another epic adventure. Imagine, if you will, the delightful country "Linguistics", so long under the ruthless sway of the evil wizard Mitnik, alias Noam Chomsky from MIT (aka Massachusetts Institute of Technology). The huge structure he and his slaves have built, which he has cynically called "generative linguistics", is in fact a wicked spell that has bewitched generations of researchers and prevented them from studying language as it really is. But fear not: a bold fellowship of brave knights, among them Geoffrey Sampson, has challenged this sterile sorcery with a greater white magic. Armed with computer corpora (large collections of authentic texts in electronic form) and daring statistical techniques, this noble brotherhood will defeat the sinister ones and restore linguistics to its true path.

Some such Manichean vision underlies this collection of papers, spanning over a quarter of a century. Sampson argues that computer corpora enable us to investigate language in new ways and to base our findings on secure empirical foundations. This is true, and the non-polemical chapters of this book demonstrate it in several places. One of the best is "Many Englishes or one English", in which Sampson asks why technical writing tends to use longer sentences than informal writing such as fiction. Using two samples from a "treebank" (a corpus in which every sentence is analysed into its component parts), he discovered that the only relevant factor is a small difference in the average size of noun phrases in the two types of writing. Other chapters outline some of the statistical techniques Sampson has been using in his recent research, and state the case for a taxonomic scheme that can handle all the data found in real speech and writing.

This is important and fruitful work, and if the book simply expounded its virtues, I would welcome it unreservedly. Unfortunately, Sampson also uses the book to attack approaches to language that do not use computer corpora. He claims that "in recent decades linguistics has not been an empirical science in practice". This is seriously misleading: generative linguists do not ignore empirical evidence; rather, they pick out the evidence that bears on the questions they consider important. What is more, this is true not only of Chomsky but of most dictionaries, practical grammars and language-teaching materials. Computer corpora have made it easier for lexicographers and textbook writers to use authentic material, but the approach is a selective one that puts the needs of language users and language learners first. What Sampson needs to show is that grammatical research can benefit in the same way by using authentic data. Interesting though the book is, it strikingly fails to do this. Nowhere does it engage with any of the issues that grammarians are investigating. What is more, it rarely illustrates what corpus-based research looks like, which is typically to present abundant examples of how a word or phrase is used, thereby revealing patterns of usage that do not show up when only isolated examples are considered.

The attack on Chomsky also gives his work an importance that it does not have. Most research into language has looked at areas such as phonetics, speech and language pathology, sociolinguistics, pragmatics, language teaching and learning, dialectology, the history of languages, machine translation and others that are highly empirical and where Chomsky has had almost no impact. Using "linguistics" to mean "Chomsky's work in grammar" gives a misleading picture of a diverse and lively field in which generativists are ploughing a small corner.

In summary, then, Sampson and his fellow knights are doing useful work, and making new discoveries about language with large computer corpora. The book does not mention any of the weaknesses of corpus-based studies, such as the tendency to concentrate on frequency data at the expense of insight and explanation, but that is forgivable in a text that sets out to promote a new research paradigm. What is less successful is his claim that all of linguistics has been led astray by Chomsky. I take the old-fashioned view that if an investigative framework leads to important and useful results, it will gain adherents, while if it doesn't, it won't. Rubbishing other frameworks is otiose and unproductive. The battle should be for knowledge and insight and against ignorance, not against other scholars.

Raphael Salkie is principal lecturer in language studies, University of Brighton.

Empirical Linguistics

Author - Geoffrey Sampson
ISBN - 0 8264 4883 6
Publisher - Continuum
Price - £50.00
Pages - 226
