The front cover, with an image of a bearded teacher bashing his head against a blackboard, encapsulates the author's frustration at the misuse and misunderstanding of academic testing. In this computerised age, to see a blackboard and chalk, not even a whiteboard and marker pen, seems archaic. The image suggests, perhaps not intentionally, the passé nature of academic testing. Nonetheless, as Daniel Koretz points out, in the US "standardized testing has been ubiquitous, just a fact of life".
The aim of the book is to provide a better understanding of educational achievement testing: what the tests show and what they don't show, and what their problems, limitations and value are. Koretz maintains that you don't have to be "a psychometrician to understand the key issues raised by achievement testing" or to be an informed user of test information.
The book is about school testing, but the principles would apply at any level of academic testing. It is written in an accessible style, and concepts are explained without recourse to technical jargon. This makes some of the explanations long-winded for those who know the area, but the book would be ideal for those with an open mind who want to find out more. The readership should therefore be general: every parent who uses league tables as a basis for placing his or her child in a school, whether in the US or anywhere else, should read this book. But the readership is more likely to be academic researchers, perhaps policymakers (who will probably ignore the central message) and maybe teachers.
Koretz is clear that educational testing "can in fact give us tremendously valuable information about student achievement". However, the central message is one of caution, and a critical scepticism about standardised testing, not least because limited and often flawed tests have far too high a profile, especially in the US, and distort the educational processes because of the endemic "teaching to test".
In the opening two chapters, Koretz explains the incomplete and indirect nature of testing, and how it necessarily samples knowledge and needs to be discriminating. He considers the difference between absolute (or normative) scoring and relative (comparative) scoring. He explores how aggregate reliability differs from individual variation in test scores, and explains the differences between reliability, accuracy and validity. Bias, however, is not addressed here and is covered only later in the book.
Koretz asserts that validity is the single most important criterion, and that tests are not valid in themselves: it is the inferences drawn from them that are valid or invalid. Reliability, accuracy and test bias, he argues, are "all pieces of the validity puzzle". Unlike those who promote the notion of "consequential validity", he claims that validity is not about the impact of testing. Controversially, he claims that a major problem is test preparation, which undermines any testing regime and leads to score inflation.
The rest of the book has chapters dealing with the issues flagged in the introduction. For example, chapter 10 addresses the issue of score inflation and chapter 11 bias and impact. Koretz argues that test preparation represents meaningful gains only if it generalises to broader real-world performance. Otherwise, inflated test scores provide an illusion of progress and an erroneous judgment about the relative performance of schools. Inflated scores cheat students "who deserve better and more effective schooling". Furthermore, all tests, however well constructed, have an element of (cultural) bias. When the scores are taken to indicate suitability for study in higher education, this bias has an adverse impact on various class and racial groups and on students with disabilities, a specific issue Koretz addresses separately in chapter 12.
Measuring Up: What Educational Testing Really Tells Us
By Daniel Koretz
Harvard University Press
Published 9 May 2008