The Royal Statistical Society wants prospective students to have reliable and robust information that can help them make sound decisions about where and what to study.
These are incredibly important life choices, so it is essential that students can rely on the information that is made available to them.
We are not against teaching quality being assessed. We just want it to be done in a way that is scientifically valid, readily comprehensible and suitably cost-effective.
Sadly, the current teaching excellence framework (TEF) fails, I believe, on each count.
Since 2016 we have outlined its many failings, and we have continued to do so by contributing to the ongoing independent TEF review, led by Dame Shirley Pearce. But we’ve also taken the unusual step of flagging our concerns to the UK’s statistics regulator and calling on it to assess the statistical basis of the whole TEF process.
We’ll be interested to see the verdict reached, in due course, by the UK Statistics Authority. But I’d be unpleasantly surprised if it finds that the TEF complies with the Code of Practice for Statistics, which requires official figures to be trustworthy, high quality and cost-effective. There are many statistical concerns, and I’ll touch on a few here.
First, there is the issue of reducing a complex provider (or subject group) of higher education to one of three categories: gold, silver or bronze. A prospective student might choose a TEF silver provider over a TEF bronze, even though the differences between them might be slight.
Indeed, two TEF silver providers could be ‘further apart’ from each other than one of them is from a TEF bronze provider.
Moreover, the RSS believes that it is imperative to have some measure of uncertainty attached to the TEF awards – something that is currently lacking. This would also encourage a proper treatment of uncertainty throughout the TEF. If we did have some uncertainty measures, I suspect that some differing awards between providers would turn out to be statistically indistinguishable.
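To make the banding point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the TEF publishes no single underlying score, and the cut-offs, provider scores and standard errors below are invented. The sketch only shows how hard cut-offs can separate near-identical providers while grouping very different ones, and how interval estimates would reveal that.

```python
# Illustrative only: the scores, cut-offs and standard errors are hypothetical.
BRONZE_SILVER_CUT = 40.0
SILVER_GOLD_CUT = 70.0


def band(score: float) -> str:
    """Assign a band from a hypothetical continuous score using hard cut-offs."""
    if score >= SILVER_GOLD_CUT:
        return "gold"
    if score >= BRONZE_SILVER_CUT:
        return "silver"
    return "bronze"


def interval(score: float, std_error: float, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval, assuming a normally distributed estimate."""
    return (score - z * std_error, score + z * std_error)


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical providers: name -> (score, standard error)
providers = {"A": (39.0, 2.0), "B": (41.0, 2.0), "C": (69.0, 2.0)}

bands = {name: band(score) for name, (score, _) in providers.items()}
intervals = {name: interval(score, se) for name, (score, se) in providers.items()}

# Gap across the bronze/silver boundary versus gap within the silver band.
gap_across_boundary = abs(providers["A"][0] - providers["B"][0])
gap_within_silver = abs(providers["B"][0] - providers["C"][0])

# A (bronze) and B (silver) cannot be told apart once uncertainty is shown.
a_b_indistinguishable = intervals_overlap(intervals["A"], intervals["B"])

print(bands)
print(gap_across_boundary, gap_within_silver, a_b_indistinguishable)
```

In this invented example, providers A and B receive different awards although their intervals overlap, while B and C share an award despite being much further apart.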
Second, TEF awards, possibly inadvertently, reinforce the notion of comparability across the sector, whereas the sector’s diversity is both surprising to outsiders and a great strength.
The RSS believes that TEF gold for one provider (or subject group) is not necessarily the same as TEF gold at another. Partly, this is due to the TEF’s benchmark process itself (which tries to compare like with like), but also because TEF cannot control for all unobserved factors. Unobserved factors include things such as course difficulty/challenge/content, but, worse, things that we don’t know about.
The RSS is not the first to raise this: Robert Crouchley and Jim Taylor, from Lancaster University, argued in 2004 that this methodology “legitimises the inappropriate use of a residual as a performance indicator”.
Third, there are many issues with the benchmarking process that underlies the TEF. This is a technical procedure that results in metrics for providers (or subject groups) being flagged as being out of line with their “demographically similar” comparators.
For example, the Council for the Defence of British Universities’ submission to the independent TEF review found that flagging seems to be related to the size of the provider.
The RSS has raised the point that the benchmarking process uses statistical critical values taken from the single hypothesis test setting, instead of methods that control for multiple hypothesis testing; see the ‘look-elsewhere’ effect if you’re interested in learning more.
In our view, it’s a clear statistical error, which also contributes to too many flags being raised on metrics that are then fed to TEF panels. This casts serious doubt on the credibility of all the TEF awards made so far.
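The arithmetic behind this concern can be sketched in a few lines of Python. This is not the TEF's actual flagging procedure: the 5 per cent level, the count of 100 tests and the assumption that the tests are independent are all hypothetical, and the Bonferroni correction is used only as one simple, well-known example of multiple-testing control.

```python
from statistics import NormalDist

ALPHA = 0.05   # hypothetical per-test significance level
N_TESTS = 100  # hypothetical number of metrics tested, all with no true effect


def two_sided_critical_value(alpha: float) -> float:
    """Critical z value for a single two-sided test at level alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def bonferroni_critical_value(alpha: float, n_tests: int) -> float:
    """Critical z value when alpha is split equally across n_tests (Bonferroni)."""
    return two_sided_critical_value(alpha / n_tests)


def prob_at_least_one_false_flag(alpha: float, n_tests: int) -> float:
    """Chance of one or more false flags, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


single = two_sided_critical_value(ALPHA)
corrected = bonferroni_critical_value(ALPHA, N_TESTS)

# Expected number of false flags if every test uses the single-test threshold.
expected_false_flags = ALPHA * N_TESTS
family_wise_error = prob_at_least_one_false_flag(ALPHA, N_TESTS)

print(f"single-test critical value: {single:.2f}")
print(f"Bonferroni critical value:  {corrected:.2f}")
print(f"expected false flags:       {expected_false_flags:.1f}")
print(f"P(at least one false flag): {family_wise_error:.3f}")
```

Under these assumptions, applying the single-test threshold to every metric would be expected to raise several flags by chance alone, whereas a corrected threshold is noticeably stricter.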
The RSS’s submission also highlights our concerns with other statistical issues, such as how to handle small sample sizes, missing and non-reportable data, interdisciplinary course assessment and, fundamentally, whether the TEF manages to live up to its very name by actually assessing teaching excellence.
Such unease has led the RSS to ask the UK Statistics Authority for a ruling on the TEF.
While we accept that TEF awards are not officially “national statistics”, they are produced by the government through a complex statistical process and the results are given enormous weight by both prospective students and many universities’ marketing and recruitment teams.
They should either be trustworthy, high-quality and cost-effective or, we believe, they are not worth producing at all.
Guy Nason is professor of statistics at the University of Bristol and vice-president for academic affairs at the Royal Statistical Society.