Why the numbers don't add up

August 27, 1999

The quality of teaching should not be measured by scores, says Roger Williams

The Quality Assurance Agency will soon present proposals for a new method of assessing the quality of university teaching.

I have argued at board level that the current numerical judgement method is inappropriate, yet it remains on the agenda. Why?

Originally, the approach seems to have recommended itself by analogy with the judgements made of the quality of research in universities. Lord Kelvin said: "When you can measure what you are speaking about and express it in numbers, you know something about it." But Lord Kelvin's numbers were a scientist's numbers, where two observers measuring the same phenomenon would expect to get the same result. Higher education uses numbers differently, to reflect judgements, not measurements.

The object of the research assessment exercises, for instance, is to assign units of research submitted by universities to one of a small number of grades (seven in 1996), on which rest money and prestige. It is inconceivable that every unit submitted should fit into just one of seven grades: common sense suggests rather a continuous spectrum from the best unit to the worst. To offset this, the assessments are made by expert panels, each responsible for a particular subject.

Given that judgements are being made and the categories confined to seven, one might perhaps claim an accuracy of, say, 95 per cent for the RAE process. By contrast, teaching quality assessments in England result in marks of 1 to 4 in each of six categories. Some commentators have then added the resulting six numbers to give a score out of 24, so universities may be ranked in a league table for each subject.

Compared with the RAE, two new types of possible error have been introduced. First, in TQAs the same panel does not make all the judgements in a particular subject. Second, error arises when TQA results are added together rather than expressed as a profile - 4, 3, 3, 3, 4, 3, say. When that group of figures becomes 20, meaning is seriously lost because that 20 is quite different from 4, 4, 3, 1, 4, 4.
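
The arithmetic can be checked in a few lines of Python. This is only an illustrative sketch: the two profiles are the ones quoted above, while the names profile_a, profile_b, total and lowest are invented here for the example and are not part of any QAA method.

```python
# Two TQA profiles from the text: six category marks, each on a 1-4 scale.
profile_a = [4, 3, 3, 3, 4, 3]
profile_b = [4, 4, 3, 1, 4, 4]


def total(profile):
    """Aggregate score out of 24, as used in league tables."""
    return sum(profile)


def lowest(profile):
    """Weakest single category mark, which the total hides."""
    return min(profile)


print(total(profile_a), total(profile_b))    # 20 20
print(lowest(profile_a), lowest(profile_b))  # 3 1
```

Both profiles collapse to the same 20, yet only the second contains a mark of 1 in one category, which is exactly the information the summed score discards.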

Now the QAA is advocating a differential touch, whereby universities the agency is concerned about will receive more attention than those where there are grounds for expecting high standards. But if because of this a subject at one institution receives 20 review days, while at another it receives only ten, numerical gradings become completely indefensible. The proper form of reporting must then be narrative precision.

The difficulty is that this may appear inadequate to the funding councils, which need to know precisely where teaching should be rewarded with more money or students. Yet the information already possessed by the QAA about the teaching performance of institutions would allow it to respond to most questions funding councils might put without the need for numerical judgements. A diverse sector should have diverse indicators of quality.

It makes no sense to insist on research and teaching assessments and then be cagey about the results, so clearly, dialogue between the funding councils and the QAA must be open. But my aim is to get away from numerical judgements, which are barely justifiable even now.

Roger Williams, vice-chancellor of Reading University, is a member of the QAA board.

How should teaching be judged?

Email us on soapbox@thesis.co.uk
