I was interested in the study into inconsistent marking, in particular that the examiners awarded “hugely inconsistent marks” and that “a large element of unreliability” would always remain in assessment (“Low marks on an essay? Try a different examiner”, News, 30 April). Both the findings and the conclusion are unsurprising, as the essay format is one of the least accurate formal assessment methods because of examiner variance. The process of moderation simply adds one unreliable practice on top of another, and Sue Bloxham is correct in saying that we should stop using it.
However, the wider truth is that many assessments in higher education are simply not fit for purpose. Among the common problems are lack of proper training for examiners, poorly designed exams and inadequate statistical evidence about the performance of the examination or its component parts. Standards such as pass marks and degree grade boundaries are seldom set using a proper method. Instead, they are often predetermined and set out in regulations without regard to the content or difficulty of the questions, which is ludicrous.
Undergraduate and postgraduate medical exams tend to be reliable and valid because of the requirements of the General Medical Council. Even in medicine, however, examiners can encounter problems, as Kevin Fong described a few weeks ago (“Campus Hunger Games”, Opinion, 5 March). He identified questions that were considered too easy, and pointed out that candidates can get the correct answer to a multiple-choice question through a blind guess.
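The point about blind guessing can be put in numbers. The sketch below is an illustration only: it assumes every question has the same number of options, exactly one of which is correct, and that a candidate guesses each question independently and at random. The function names and any paper sizes or pass marks fed into it are hypothetical, not taken from any real examination.

```python
from math import comb


def expected_score_by_guessing(n_items: int, n_options: int) -> float:
    """Expected number of correct answers from pure random guessing."""
    return n_items / n_options


def prob_pass_by_guessing(n_items: int, n_options: int, pass_mark: int) -> float:
    """Probability of scoring at least `pass_mark` by guessing alone.

    Assumes independent guesses, each correct with probability
    1 / n_options (a binomial model).
    """
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )
```

On a single five-option question the chance of a correct blind guess is 0.2, whereas the chance of passing a whole paper by guessing falls quickly as the number of items rises. That is one reason the number of items matters in examination design.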
These two articles illustrate an important but often ignored principle of examinations. There are three different kinds of expertise necessary to design and quality-assure examinations. The first is subject expertise. The second is expertise in examination design, which includes expertise in assessment theory, methodology, number of items required, testing time and standard setting. The third is expertise in psychometrics, which includes the measurement and interpretation of statistics such as reliability, item difficulty and ability to discriminate between passing and failing candidates.
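For readers unfamiliar with the statistics mentioned above, the following is a minimal sketch of how they might be computed for a paper marked right or wrong. It rests on several assumptions: each response is scored 1 (correct) or 0 (incorrect), reliability is estimated with the Kuder-Richardson 20 formula (one common choice among several), and discrimination is taken as the correlation between an item and the total score on the remaining items. The function names and the small data set are invented for illustration.

```python
from statistics import mean, pvariance


def item_difficulty(responses, item):
    """Proportion of candidates answering the item correctly (0 to 1)."""
    return mean(row[item] for row in responses)


def item_discrimination(responses, item):
    """Correlation between the item score and the score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    mi, mr = mean(item_scores), mean(rest_scores)
    cov = mean((i - mi) * (r - mr) for i, r in zip(item_scores, rest_scores))
    sd_i = pvariance(item_scores) ** 0.5
    sd_r = pvariance(rest_scores) ** 0.5
    if sd_i == 0 or sd_r == 0:
        return 0.0  # an item everyone gets right (or wrong) cannot discriminate
    return cov / (sd_i * sd_r)


def kr20(responses):
    """Kuder-Richardson 20 reliability estimate for 0/1 scored items."""
    k = len(responses[0])
    var_total = pvariance([sum(row) for row in responses])
    if k < 2 or var_total == 0:
        raise ValueError("need at least two items and some score variation")
    pq = sum(
        p * (1 - p)
        for p in (item_difficulty(responses, j) for j in range(k))
    )
    return (k / (k - 1)) * (1 - pq / var_total)


# Invented data: five candidates (rows) answering four items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
```

With this invented data, `item_difficulty(responses, 0)` shows that four of the five candidates answered the first item correctly, and `kr20(responses)` gives a single reliability figure for the paper as a whole. A real analysis would need far more candidates and items than this toy example.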
Many examiners in higher education might not have much more than subject expertise, and until this changes, unreliable and invalid examinations will probably continue.
Consultant in medical and dental education