Peer review is a sacrosanct tenet of scientific endeavour and, like most holy relics, it is passionately defended despite being based on scant evidence. This was illustrated to me earlier this year when I sat on the International Peer Review Expert Panel for the Canadian Institutes of Health Research (CIHR).
The background to the establishment of this panel, which was charged with reviewing the design and adjudication processes of the CIHR’s investigator-initiated programmes, was a sad tale of well-intentioned reforms undermined by implementation failures, funding constraints and political intervention, resulting in a breakdown in trust, civility and solidarity between researchers and the CIHR. Such was the poisonous atmosphere that the story had even spilled over into the mainstream media.
The CIHR had a complex landscape of funding programmes and committees. It wanted to consolidate its old operating grants programme into two schemes – project grants and longer-term, investigator-focused foundation grants. These new schemes were to be supported by a streamlined peer review process. On paper, these reforms made sense to the panel, but their initial implementation had failed. The key problem seemed to be a malfunctioning algorithm that automatically matched grant applicants with potential peer reviewers. However, the issue that researchers campaigned around was the use of virtual – rather than face-to-face – peer review panels. An open letter to the Canadian minister of health, signed by 1,300 researchers, claimed that virtual review “removed the peer pressure for reviewer performance”.
But as pointed out in the panel’s report, this assertion is not supported by the evidence. Commissioned by the CIHR on behalf of the panel, RAND Europe found that only two studies have evaluated virtual peer review: one using teleconferencing and the other using the Second Life virtual world. The former study, published in 2015, set up one video conference and three face-to-face panels modelled on the US National Institutes of Health’s review procedures. It concluded that, despite participants’ preference for face-to-face arrangements, scoring was similar between the two types of panel. The latter study, from 2013, examined two years of face-to-face and teleconferenced peer review discussions. It also found minimal differences in score distributions, levels of agreement among assessors and reviewer demographics.
In fact, the limited evidence suggests that face-to-face review is subject to biases based on individual characteristics, such as gender. Decision-making can also become conservative and subject to group dynamics, with just the few individuals considered the most “competent” on a particular topic leading the evaluation. What we don’t know is whether these flaws would be replicated in virtual panels.
The point is that peer review is a subjective process and this matters more when success rates are low (CIHR rates were between 10 and 15 per cent). Typically, the distribution of scores for research funding will follow an “S” curve. There will be a flat line of low-scored, unfunded applications, then an inflection point and a steep gradient to another inflection point, beyond which there is another flat line of high-scored, fundable applications. The ideal system involves peer review differentiating at the second inflection point, but low success rates mean that all of the funding candidates are on the flat top of the S. That leaves the peer reviewers with the impossible task of meaningfully differentiating between them.
The rational, but controversial, thing to do at that stage is to leave it to Lady Luck: that is, decide what is in the top, say, 20 or 30 per cent of fundable grants, and then use a lottery to decide which 10 or 15 per cent should be funded.
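For readers who want to see how such a scheme might work mechanically, here is a minimal sketch in Python. It is an illustration only, not a procedure used by the CIHR or recommended by the panel: the function name `partial_lottery`, its parameters and the default fractions (25 per cent fundable, 12.5 per cent funded) are all assumptions chosen to sit within the ranges mentioned above.

```python
import random


def partial_lottery(scores, fundable_fraction=0.25, funded_fraction=0.125, seed=None):
    """Shortlist by peer review score, then fund by lottery.

    scores: dict mapping an application id to its peer review score
            (higher is better). All names here are hypothetical.
    Returns (pool, funded): the shortlisted ids and the randomly funded ids.
    """
    if not 0 < funded_fraction <= fundable_fraction <= 1:
        raise ValueError("need 0 < funded_fraction <= fundable_fraction <= 1")
    if not scores:
        return [], []

    # Step 1: peer review decides which applications are fundable at all.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_fundable = max(1, round(len(ranked) * fundable_fraction))
    pool = ranked[:n_fundable]

    # Step 2: within the fundable pool, scores are ignored and chance decides.
    n_funded = min(n_fundable, max(1, round(len(ranked) * funded_fraction)))
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    funded = rng.sample(pool, n_funded)
    return pool, funded


if __name__ == "__main__":
    # Forty made-up applications with made-up scores.
    example_scores = {f"app{i}": i for i in range(40)}
    pool, funded = partial_lottery(example_scores, seed=1)
    print("fundable pool:", pool)
    print("funded by lottery:", funded)
```

The design choice worth noting is that reviewers are asked only to make the judgement they can plausibly make, namely whether an application is fundable, while the distinction they cannot reliably make, among applications on the flat top of the curve, is handed to a random draw that can be recorded and audited.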
Innovation in peer review is essential, but before embracing reforms such as lotteries or open peer review, we need to gather much better evidence about peer review’s consistency, its known and unconscious biases and its sensitivity to technology. We also need to think harder about its cost relative to the amount of funding being distributed.
In other words, we need to approach peer review as scientists. It cannot be acceptable that we fail to apply the same standards of evidence and rigour to the way that we administer and manage research as we do to conducting the research itself.