In response: do REF cycles really encourage 'poorer quality research'?

There is no evidence that the REF process encourages academics to rush out more research of a lower quality, says Steven Hill

January 8, 2018

In developing a national research assessment process such as the research excellence framework (REF), a key issue is the potential for unintended negative consequences.

This is why the four higher education funding bodies develop the REF, building on careful evaluation and with extensive consultation. So my attention was grabbed by a piece in Times Higher Education reporting that there was evidence of researchers "rushing out...poor quality research" in response to deadlines imposed by the REF.

The report is based on a new preprint that was posted last month. Looking at the transition point between the publication windows for the research assessment exercise 2008 and REF 2014, the paper claims to provide evidence that more articles are published in the year before the transition, and that those papers are cited less than those in the following year.

This sounds like compelling evidence that the assessment is encouraging undesirable behaviour, but a closer look at the data suggests there are problems with the interpretation.

The first claim is that more articles are published in the year before the deadline. The conclusion is based on the date of publication of articles submitted to RAE 2008 compared to REF 2014, and it is indeed the case that articles published closer to the end of the RAE period are more likely to be submitted. This is not a new observation, and was reported in 2016 in work that Hefce commissioned from Digital Science.

The pattern is discernible across a number of assessment cycles, although there are interesting disciplinary differences, with the tendency to submit more recent material disappearing in the science and engineering disciplines over time, but remaining in the social sciences and humanities. Of course, all of this type of analysis comes with a caveat in that it is limited to journal articles, and so only considers a minority of the submitted outputs outside of the sciences.

The reasons for this pattern are not clear, but it is important to remember that the data are for submitted articles, and the data do not reflect total volumes. Elsewhere in the paper, the authors do investigate total volumes, but the evidence is much less compelling.

First, they express the results as the UK share of global output, not as absolute numbers. There is some fluctuation in this share that links to assessment cycles, but this is only apparent for articles published in journals with a low impact factor. Looking elsewhere, there is no evidence of significant shifts in total volume in the various reports that Elsevier have produced for the UK government. There have been changes in the share, but this is largely attributed to increases in production in other countries, notably China and India.

The second claim is that articles published at the end of the assessment window have a lower citation score than those published at the beginning. Again, this conclusion is based on differences in the articles submitted for assessment rather than the total pool. The Elsevier reports referenced suggest a steady increase in the field-weighted citation impact (FWCI) of the total UK output, with no evidence of discontinuities around assessment cycles.

So how to explain the fact that articles from 2007 submitted to RAE 2008 have lower citation scores than those from 2008 submitted to REF 2014?

We know that submission choices are made at the end of the cycle. Decisions on whether or not to include the 2007 articles in RAE 2008 were made in 2007. Because those articles were so new, very limited information about their citation rate was available to influence the submission decision.

In contrast, the decisions on whether to include the 2008 articles in the REF 2014 submission were made in 2013, and, although article-level citation scores are a poor proxy for quality, it may be that citation information influenced the submission decisions. As a result, choices about old articles (published in 2008, with the decision made in 2013) are likely to be biased towards those with high citation counts, whereas choices about new articles (published in 2007, with the decision made in 2007) are not.

In the latter case, acknowledging the criticism that citations are a weak proxy for quality when applied to individual publications, you could draw the conclusion that different, and possibly better, judgements about quality are being used. In any event, if citations are used to inform selection it is not surprising that the selected articles have more citations – this is more an artefact than "an unintended negative consequence". Just perhaps it is a positive consequence that articles published late in the assessment cycle are selected using informed academic judgements on quality, rather than citation rates.
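The selection argument above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not an analysis of the preprint's data: the function name `simulate`, the pool sizes and the log-normal citation distribution are all hypothetical, and selecting old articles purely by citation count is a deliberately extreme version of "citation information influenced the submission decisions". Both cohorts are drawn from the same distribution, so any gap between the selected sets comes from how they were chosen.

```python
import random
import statistics


def simulate(n_articles=10_000, n_selected=2_000, seed=0):
    """Return (mean citations of selected late-cycle articles,
    mean citations of selected early-cycle articles).

    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Both cohorts draw their eventual citation counts from the SAME
    # distribution, so by construction there is no underlying difference.
    late_cycle = [rng.lognormvariate(2.0, 1.0) for _ in range(n_articles)]
    early_cycle = [rng.lognormvariate(2.0, 1.0) for _ in range(n_articles)]

    # Late-cycle articles (e.g. published 2007, chosen in 2007): no citation
    # data exists yet, so selection is unrelated to eventual citations.
    late_selected = rng.sample(late_cycle, n_selected)

    # Early-cycle articles (e.g. published 2008, chosen in 2013): citation
    # counts are visible, and here (extreme assumption) fully drive selection.
    early_selected = sorted(early_cycle, reverse=True)[:n_selected]

    return statistics.mean(late_selected), statistics.mean(early_selected)


if __name__ == "__main__":
    late_mean, early_mean = simulate()
    print(f"late-cycle selected, mean citations:  {late_mean:.1f}")
    print(f"early-cycle selected, mean citations: {early_mean:.1f}")
```

In this sketch the early-cycle selection comes out with a higher mean citation count even though the two pools are statistically identical, which is the kind of artefact described above. Real submission decisions will mix citation data with academic judgement, so the size of any such effect in practice is unknown.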

So, overall I don't think the conclusions are supported by the data in the paper. While there is evidence that, for some disciplines, recently published articles are more likely to be submitted for assessment, there is no evidence that this equates to increased publication volumes.

The reported citation differences can be explained by factors other than differences in quality. We need to always be vigilant for potential unintended effects of assessment, but this work doesn't provide cause for concern. 

Steven Hill is head of policy (research) at the Higher Education Funding Council for England (Hefce).


Readers' comments (2)

I think the requirement of 4 outputs per academic to be submittable had a significant effect. As the census date loomed, the pressure to have those 4 outputs was extremely strong. This undoubtedly leads to an attempt to get more things published, and thus submitted, in the last year before the census. As found in my research: https://kclpure.kcl.ac.uk/portal/en/publications/an-analysis-of-the-arts-and-humanities-submitted-research-outputs-to-the-ref2014-with-a-focus-on-academic-books(9cfc5250-07e0-4d82-9b0b-0853447024e6).html Page 26: "In the final year before the census date for the REF2014 an average of over 27% of books were published. The trend shown in Figure 10 is clear with more books published year on year from the relative low of 2008 through to the peak year of 2013. This effect is not confined just to books; with the average across all research output types being ~25% in the final year and increasing year on year from ~10% in the first year. Thus, there is a clear effect of the hard deadline of REF on when books are published. The back loading over the 5 year period with a last year rush will have significant effects upon capacity: for publishers, editors, peer review and academics alike." I believe that the REF2021 rules, with an average of 2.5 outputs per academic, will address the problem of rushing out research outputs that may not be as high quality or as polished as possible. The emphasis will be on quality not quantity, and this will be a stronger indicator of the research strength of the Unit of Assessment.

Papers early in the cycle will have the luxury of being submitted initially to higher ranked journals even if they are not likely to be accepted; some will randomly be selected at a higher rank than warranted. This is the first thing that needs modeling: the choice of outlet. The second need is for a model of citation. If citations are naturally biased towards higher ranked journals, the more cautious submission behaviour in the later part of the cycle will bias cites downward.
