Made to measure

September 25, 1998

League tables are here to stay, like it or not. Leslie Wagner kicks off this year's THES Trends with a call for more value-added appraisal of university performance

Value-added is the buzz word in education. In schools it can be measured by comparing pupils' test scores at key stages in the national curriculum.

In higher education this is seen as a non-starter. Whatever the deficiencies of school value-added scores, they are at least based on a national curriculum with nationally validated tests. While A levels may approximate to a national test as the starting point, in higher education the curriculum and the assessment process are not nationalised and are in the hands of each university. Indeed, some see the risk of facilitating a national curriculum for higher education as a good reason for opposing attempts to measure value-added.

So are we to be denied a measure of the value added by different universities? I think not. The pressure on institutions to be more accountable for their performance is mounting and many institutions see value-added as the most important element of that. Relying on assertion alone will no longer do. There has to be a measure, however frail. The school league tables experience indicates that the publication of crude tables in the first instance is the spur for the detailed work necessary to make them more sophisticated and fair.

The reluctance of higher education to contemplate a value-added measure, far from preventing the introduction of a national curriculum, may instead hasten its development. The work of the Quality Assurance Agency on benchmark standards could be adapted for these purposes. If policymakers believe that value-added is important enough, fears about a national curriculum will not stand in their way.

So how might a value-added measure, albeit a crude one, be constructed? The approach suggested here is to measure the extent to which each university's performance deviates from the historical pattern of the relationship between entry standard and final degree performance.

Let us start with A-level performance as it is still the dominant pre-higher education qualification, particularly for full-time students. Over the past five years, more than one million students who entered with A levels have left higher education, almost all of them with a classified degree. This provides a rich database to create historical national patterns. We could map the outcome for each A-level points score. So, for example, we might find that the pattern for those who entered with 20 points was, in percentage terms:

First 7
Upper-second 35
Lower-second 45
Third or below 5
Not completed 8

In all we might have 30 such patterns, one for each A-level points score. If this were thought too many, the points could be grouped into threes to provide ten different ranges. An individual university's performance at each A-level points score, or group of scores, could then be compared with the historical pattern. So, using this example, the University of Puddleshire might find that those who graduated in 1997/98, having entered with 20 A-level points, performed as follows:

First 8
Upper-second 37
Lower-second 45
Third or below 4
Not completed 6

On balance it looks as if Puddleshire has performed better than the historical norm for those with 20 points at A level. How might this be given a value-added score?

One answer is to give a numerical value to each score. This is a matter of judgement and will no doubt be the subject of fierce debate. But the judgement is not about the relationship between A levels and degree classifications but rather what weight to give to the different classifications of degree. For example, it might be simply linear with four points for a first, three for an upper-second down to zero for a non-completion. Others might argue that a first is qualitatively different from the rest and should be awarded five points with three for an upper-second and so on.

Taking this more elitist view of the value of a first, let us apply it to our percentage scores. The historical average score in the table above is 2.35. The University of Puddleshire's score is 2.45, an increase of 4.3 per cent. So for those students entering with 20 A-level points Puddleshire added 4.3 per cent of value by virtue of achieving a range of outcomes above the historical norm. A similar calculation can be made across the full range of A-level scores for Puddleshire and for every other university, and the results compared.
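To make the arithmetic concrete, here is a minimal sketch in Python that reproduces the figures quoted, for both the "elitist" and the linear weightings. The language, the function name weighted_score and the variable names are illustrative only, not part of the proposal; the percentages and point values are those given above.

```python
# A minimal sketch, assuming the percentage distributions and point
# weightings quoted in the article. Names are illustrative only.

def weighted_score(distribution, weights):
    """Average points per entrant, given a percentage distribution of outcomes."""
    return sum(distribution[outcome] * weights[outcome] for outcome in distribution) / 100

# Outcomes for entrants with 20 A-level points (percentages, from the tables above).
historical = {"First": 7, "Upper-second": 35, "Lower-second": 45,
              "Third or below": 5, "Not completed": 8}
puddleshire = {"First": 8, "Upper-second": 37, "Lower-second": 45,
               "Third or below": 4, "Not completed": 6}

# "Elitist" weighting: five points for a first, three for an upper-second, and so on.
elitist = {"First": 5, "Upper-second": 3, "Lower-second": 2,
           "Third or below": 1, "Not completed": 0}
# Linear weighting: four points for a first down to zero for a non-completion.
linear = {"First": 4, "Upper-second": 3, "Lower-second": 2,
          "Third or below": 1, "Not completed": 0}

for name, weights in (("elitist", elitist), ("linear", linear)):
    base = weighted_score(historical, weights)
    uni = weighted_score(puddleshire, weights)
    value_added = 100 * (uni - base) / base
    print(f"{name}: historical {base:.2f}, Puddleshire {uni:.2f}, "
          f"value-added {value_added:.1f} per cent")
# elitist: historical 2.35, Puddleshire 2.45, value-added 4.3 per cent
# linear: historical 2.28, Puddleshire 2.37, value-added 3.9 per cent
```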

This methodology does not postulate any artificial or theoretical relationship between A levels and degree classifications. It takes the empirically observed historical pattern and compares an individual university's performance with that pattern. A university that performs better than the historical average adds value while one that does not, subtracts value. If the notion of "subtracting value" is too sensitive, an alternative is to assume that all activity "adds value" and the measure then becomes the extent to which such value is above or below the norm.

This is closest to what students would interpret as value-added. A student with 20 points at A level wants to know whether they are likely to perform better or worse than the average at a particular university.

A number of objections are likely to be raised. Final degree classifications should not be used as a measure of output, it will be argued, because they are not standardised between universities.

A first from Puddleshire is not the same as one from (say) Bristol. In which case, one might question the point of the external examiner system.

As Dearing pointed out, this system needs to be improved and these improvements are now in hand.

At least we have such a system, which is more than many other countries can say. Universities have accepted the use of degree classifications in The Times and THES tables for some years now. It is a bit late to argue that they are inadequate as a comparative measure of educational output.

A second objection is the allocation of points to different degree classifications. This is a matter of experimentation and judgement and there is room for differences of opinion.

If in the example above a linear scoring system had been adopted with only four points for a first, the percentage value-added by Puddleshire would have been 3.9 per cent instead of 4.3 per cent.

Another problem is the significant variation in degree classifications between subjects. Thus the value added by a university may be underestimated because it does not teach those subjects that traditionally award a higher percentage of firsts.

However, there is nothing in the methodology that prevents the analysis being undertaken by subject in each university. A weighting factor could then be applied in calculating the aggregate score for each university.
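As a purely illustrative sketch of how such an aggregation might work, the fragment below weights each subject's value-added figure by the number of students observed in it. The article does not prescribe the weighting factor, and the subjects and numbers here are invented for illustration.

```python
# Hypothetical sketch only: the weighting factor, subjects and figures
# are assumptions, not part of the article's methodology.

def aggregate_value_added(subject_results):
    """Student-weighted average of per-subject value-added percentages."""
    total = sum(students for students, _ in subject_results.values())
    return sum(students * va for students, va in subject_results.values()) / total

# (students observed, value-added per cent) for each subject -- illustrative only.
puddleshire_by_subject = {
    "History": (120, 4.3),
    "Physics": (80, 2.1),
    "Business": (200, 3.0),
}

print(f"Aggregate value-added: {aggregate_value_added(puddleshire_by_subject):.1f} per cent")
# Aggregate value-added: 3.2 per cent
```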

A subject analysis would be of even greater value for potential students. However, there might be insufficient observations in any one year in each university to make statistically significant comparisons at each A-level point. This is where the grouping of points to provide ranges might be more appropriate.

There is also the objection that many students, particularly at the post-1992 universities, enter with different qualifications and study part-time.

Obviously this is the case, but we have to start somewhere. In any event the methodology advocated here can be applied to any set of entry qualifications, provided there are enough observations to create a historical pattern and enough students in any given year at an individual university to provide a valid comparator.

The methodology set out here has flaws, as do all measures of higher education activity. But it is a start and it is capable of being built on. We must be careful, in looking always for perfect measures, not to make the best the enemy of the good.

Leslie Wagner is vice-chancellor of Leeds Metropolitan University.
