Research intelligence: how to sniff out errors and fraud

A growing number of data detectives are on the hunt for sloppy science and dodgy statistics. Jack Grove examines the methods they use

January 23, 2020

These days it is not just co-authors or peer reviewers who are checking journal papers for errors: a growing number of self-appointed fraud busters are scanning scientific literature for flaws.

This unpaid and mostly anonymous endeavour has led to the retractions of hundreds of papers and even disciplinary action where wrongdoing is exposed.

So how can scholars catch errors when reviewing others’ papers, or when double-checking their own work or that of collaborators?

One obvious giveaway that something may be amiss is that a paper’s dataset does not include enough zeroes or ones, said David Sanders, associate professor in Purdue University’s department of biological sciences, who has received international attention for calling out scientific malpractice. “Studies have shown that the numbers 0 and 1 are over-represented in real research datasets [beyond the first digit], so if this distribution is not apparent, then that could be a sign,” Dr Sanders told Times Higher Education, citing what is known as Benford’s Law, which describes how often each digit is expected to appear in many naturally occurring sets of numbers.
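For readers who want to try this, the expected digit frequencies under Benford’s Law can be worked out in a few lines. The Python sketch below is illustrative only: the function names are invented for this example, and the law is a reasonable yardstick only for data that span several orders of magnitude, so a mismatch is a prompt for a closer look rather than proof of anything.

```python
import math
from collections import Counter


def benford_first_digit(d: int) -> float:
    """Expected share of values whose first significant digit is d (1-9)."""
    return math.log10(1 + 1 / d)


def benford_second_digit(d: int) -> float:
    """Expected share of values whose second significant digit is d (0-9)."""
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


def first_digit_shares(values):
    """Observed share of each first significant digit in a list of numbers."""
    # Scientific notation puts the first significant digit at position 0.
    digits = [int(f"{abs(v):e}"[0]) for v in values if v != 0]
    if not digits:
        return {d: 0.0 for d in range(1, 10)}
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    for d in range(1, 10):
        print(d, round(benford_first_digit(d), 3))
    for d in range(10):
        print(d, round(benford_second_digit(d), 3))
```

Under these formulas about 30 per cent of values are expected to start with a 1, and 0 and 1 remain the most common digits in the second position, although by a much smaller margin.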

He also advises would-be sleuths to consider the issue of “p-hacking”, a term popularised by a 2014 article in Nature, which describes unconscious or conscious efforts to manipulate data to produce a desired probability value (p-value). Because a p-value above 0.05 is conventionally taken to mean that an experiment did not generate a statistically significant result, and therefore that the null hypothesis cannot be rejected, researchers may try out different analyses or combinations of variables until one yields a result just below this threshold. Borderline results could indicate that some manipulation has occurred, said Dr Sanders.

“Some people are not aware that they are doing this, but there are others who know it’s wrong but do it anyway,” he added.
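A little arithmetic shows why trying many analyses is risky. The sketch below assumes, for simplicity, that every test is independent and that there is no real effect to find; the function name is made up for this illustration.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_tests independent tests dips
    below alpha purely by chance, assuming no real effect exists."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, round(chance_of_false_positive(n), 3))
```

On these assumptions, a researcher who runs 20 unrelated comparisons has roughly a 64 per cent chance of finding at least one “significant” result.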

These basic techniques, however, should not hide the fact that errors can often be difficult to pick up, even within scientific teams, said Dr Sanders.

“In my lifetime, we moved from an average of three authors on a life sciences paper to about six – with people becoming experts in a distinct area, but not expert in other parts of an experiment,” he continued. “That means people get away with things that they wouldn’t if it was just one scientist and their assistant.”

While some errors or manipulation are tricky to spot, others are easier, even when full datasets are not provided alongside papers, some argue.

“You don’t need a degree in statistics to catch most of these errors – common sense and simple arithmetic are often all that’s required,” writes Kristin Sainani, who teaches statistics as an associate professor at Stanford University, in a recent paper published in PM&R, the scientific journal of the American Academy of Physical Medicine and Rehabilitation.

In one paper analysed by Dr Sainani, the “implausibly large effect” claimed by the author suggested that something was wrong, she says. The meta-analysis, which made huge claims for the effect of a non-surgical treatment for knee pain, was later revealed to have made a basic statistical error when transposing the data.

Looking for “statistical and numerical inconsistencies” can also reveal larger problems within the dataset, adds Dr Sainani. She points to the Granularity-Related Inconsistent Means (GRIM) test developed by James Heathers and Nicholas Brown as a fairly easy way to check if something is amiss. Their paper defined this test as evaluating “whether the reported means of integer data…are consistent with the given sample size and number of items”.
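The idea behind the test is simple enough to sketch in code: if every response is a whole number, the sum of the responses must be a whole number too, so only certain means are possible for a given sample size. The Python below is a minimal illustration of that logic, not the authors’ own implementation; the function and parameter names are assumptions made for this example.

```python
import math


def grim_consistent(mean: float, n: int, decimals: int = 2, items: int = 1) -> bool:
    """Return True if some whole-number total could produce the reported
    mean once rounded to the given number of decimal places."""
    cells = n * items  # each participant contributes `items` integer scores
    half_unit = 0.5 * 10 ** -decimals  # widest gap that rounding can hide
    lowest = math.floor((mean - half_unit) * cells)
    highest = math.ceil((mean + half_unit) * cells)
    return any(
        abs(total / cells - mean) <= half_unit + 1e-9
        for total in range(lowest, highest + 1)
    )


if __name__ == "__main__":
    print(grim_consistent(5.18, 28))  # True: 145 / 28 rounds to 5.18
    print(grim_consistent(5.19, 28))  # False: no whole-number total fits
```

With 28 participants, for instance, a reported mean of 5.19 is impossible, because the nearest achievable values are 5.18 and 5.21. The check loses its power as the sample grows, since larger samples make almost any two-decimal mean achievable.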

She also advises using “easy-to-use, online web applications designed to detect statistical inconsistencies in papers, such as Statcheck and GRIM”, adding that “further inspection is needed to determine the source of the inconsistency” once identified.
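The principle behind such tools is to recompute a p-value from the reported test statistic and see whether the two agree. The sketch below does this only for a two-sided z-test, using Python’s standard library; it is a much-simplified illustration of the idea rather than Statcheck’s own code, and the function names are invented for this example.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def p_value_consistent(z: float, reported_p: float, decimals: int = 2) -> bool:
    """Return True if the reported p-value matches the recomputed one,
    allowing for rounding to the given number of decimal places."""
    half_unit = 0.5 * 10 ** -decimals
    return abs(two_sided_p_from_z(z) - reported_p) <= half_unit + 1e-12


if __name__ == "__main__":
    print(p_value_consistent(2.20, 0.03))  # True
    print(p_value_consistent(1.50, 0.04))  # False
```

A mismatch may be nothing more than a typo or a one-sided test, which is why further inspection is needed before drawing conclusions.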

Running such checks can be tricky when authors fail to provide access to their raw data, but it is possible to gain some extra data from plots or images, writes Dr Sainani, who recommends the free online tool WebPlotDigitizer, which extracts precise values from scatter graphs or bar charts to reverse-engineer a downloadable dataset that allows reanalysis.

Such methods might help academics pick up the most egregious unintentional errors, but premeditated wrongdoing could be harder to crack. “Intentional errors are typically designed to be hard to detect,” write Line Edslev Andersen and K. Brad Wray, from Denmark’s Aarhus University, in a recent paper in the journal Social Studies of Science, which analysed the reasons for 92 retractions from Science over the past 35 years.

“Intentional errors are likely to consist [of] misrepresentations that are hard to uncover by studying the raw data [and] can often only be discovered by scientists who did not participate in the research reproducing the experiments,” they conclude.

But having “authorship policies that to some extent require the different parts of the research to be checked by more than one member of team” would be a useful step in detecting the most serious errors, they add.

jack.grove@timeshighereducation.com

POSTSCRIPT:

Print headline: The art of sniffing out rotten research
