In 2014, more than 150 UK higher education institutions submitted nearly 200,000 research outputs and about 7,000 impact case studies to the Research Excellence Framework (REF), at an estimated total cost of nearly £250 million. Those figures are not expected to fall this time around, so what do we get for a quarter of a billion pounds? How effective is the REF at assessing quality?
The draft guidance on REF 2021 associates quality with originality, significance and rigour, but its grading criteria remain hazy and subject to variation between units of assessment (UoAs). What counts as “world-leading” originality, for instance? How accurately can small panels of multidisciplinary reviewers determine how far an output would need to fall below the “highest standards of excellence” before it is rated 3* instead of 4*?
Then there is the question of sample size. In 2021, institutions must submit an average of 2.5 outputs per academic in a UoA over the seven-year qualification period. Compared with 2014’s requirement of four outputs per academic selected for submission, this is a more inclusive approach, intended to engage a wider proportion of the academic community. However, it remains only a selective snapshot of active researchers’ productivity and may not fully differentiate between groups. Moreover, such selectivity seems unnecessary when modern electronic systems can cope with huge datasets.
Each of the 34 assessment subpanels consists of about 15 experts, or roughly 510 panellists in all. Based on 2014 submission figures, with nearly 200,000 outputs each assessed by two people, every panellist will need to review more than 700 outputs over a few months. The impossibility of doing so with the appropriate level of critical insight is exacerbated by the diversity of topics within each UoA, which makes the instruction that panels must disregard journal hierarchies particularly perverse.
A decade ago, a study put the cost of journal peer reviewing at £1.9 billion a year. Although the efficacy of the system is debated, it is a fundamental principle of publication that papers are assessed by reviewers selected for their specialist knowledge of the specific topic in question. Such review is likely to be more rigorous than anything the REF panels’ generalists can manage. Surely it would be a much better use of taxpayers’ money to stop reviewing the same papers a second time and free up the panellists to focus on higher-order evaluations, such as the coherence of work and its impact.
Australia’s REF equivalent, known as Excellence in Research for Australia (ERA), is a case in point. It recently closed its consultation period for compiling the discipline-specific journal rankings on which it largely relies to assess scientific subjects. These rankings do much more than apply a simple journal impact factor: they recognise the prestige of the publication with respect to each area of research, on the understanding that a journal that is highly prestigious in one field may be less so in a neighbouring one.
The rankings make the plausible assumption that if a discipline agrees that a particular journal carries a 4* ranking, then most articles published therein will be of that quality. Clearly this cannot be guaranteed in every case, but that doesn’t matter at the macro level, particularly if the assessment takes in all outputs published in the relevant period, rather than a REF-style sample.
As well as being more transparent than the current REF methodology, a fuller desk-based evaluation of outputs based on agreed subject-specific publication rankings could be carried out more frequently than every seven years. This would give a truer insight into both the productivity of each research group and the quality of its work, and provide a stronger basis for the distribution of research funds.
Andrew Edwards is head of the School of Human and Life Sciences at Canterbury Christ Church University. Tomasina Oh is associate dean of research at Plymouth Marjon University. Florentina Hettinga is reader in the School of Sport, Rehabilitation and Exercise Sciences at the University of Essex. Views expressed are the authors’ own.