The news in autumn 2011 that Diederik Stapel, the highly respected Dutch social psychologist, had committed scientific fraud on a huge scale came as a shattering blow to the international community of social psychologists.
A recipient of scientific awards from major associations in Europe and the US, Stapel had enjoyed a meteoric rise within our profession. Awarded the social psychology chair at the University of Groningen in 2000, only three years after earning his doctorate, he then moved to Tilburg University, where he became dean of the School of Social and Behavioural Sciences (and a regular tennis partner of the rector).
As he was famous well beyond his field for publishing papers purporting to show, for example, that the presence of wine glasses improves table manners, that messy environments promote discrimination and that meat eaters are more antisocial than vegetarians, the international media lapped up stories of his spectacular fall from grace. He was suspended and later dismissed by Tilburg.
In his autobiography, Ontsporing (Derailment), published at the end of 2012, Stapel admits to committing fraud from the beginning of his career, moving from minor data falsification to outright fabrication over the years. Last November’s joint report by the investigating committees established by all three of Stapel’s former institutions identified 55 articles as fraudulent, 47 of which have so far been retracted. That puts Stapel into the top tier of serial fraudsters, but still far below the present record holder for fraudulent articles, Japanese anaesthesiologist Yoshitaka Fujii.
The committees identified numerous flaws in Stapel’s research, ranging from poor statistical methods to incorrect and incomplete descriptions of the way a study had been conducted and data had been collected. The report acknowledges that since Stapel’s publications “do not constitute a random sample of social psychological publications”, it “goes without saying that the committees are not suggesting that unsound research practices are commonplace in social psychology”.
Just as well, you might think, since none of the committee members has any background in social psychology. Yet, a few pages later, the report makes an about-turn, saying that a “byproduct” of the committees’ inquiries is the conclusion that “there are certain aspects of the discipline that should be deemed undesirable or even incorrect from the perspectives of academic standards and scientific integrity”.
The Stapel affair has been particularly damaging because it occurred during a precarious period for social psychology. With the exception of cognitive dissonance theory, our discipline has been known mostly for supporting uncontroversial theories. But things changed dramatically towards the end of the 20th century with the rediscovery (the notion had already struck Freud and the behaviourists) that people can be influenced by stimuli in their environment without being aware of it.
Empirical exploration of these phenomena by cognitive social psychologists resulted in numerous counter-intuitive findings. For instance, consumers bought more French than German wines after French music was played in a supermarket. Repeated 20-millisecond exposures to the words “Lipton Ice” increased the frequency with which that brand was preferred to other soft drinks subsequently - even though 20 milliseconds is too short a period to recognise the name consciously. And most startlingly, exposing people to words related to elderly people made them walk more slowly when they left the site where the experiment was conducted.
Social psychologists attribute all such effects to “priming” or increased “cognitive accessibility”: the idea that exposure to an external stimulus brings certain thoughts to the top of people’s minds. Thus, French music is likely to create in some shoppers warm feelings about France; exposure to “Lipton Ice” brings the brand closer to conscious attention; and words related to the elderly trigger the stereotype of old people, and since walking slowly is part of that stereotype, this unconsciously influences people’s walking speed.
Although there are hundreds of studies demonstrating such behaviour-priming effects, such findings were so counter-intuitive that they were met with a great deal of disbelief, particularly among cognitive psychologists. This was ironic because it was they who had originally developed the concept of priming. But whereas these asocial cognitive psychologists studied priming in darkened labs, protected against any possibility of social factors influencing their effects, cognitive social psychologists used their methods and theories to study priming effects on behaviour. And whereas the press often reported the findings of cognitive social psychologists, reporters were less interested in the work of their non-social colleagues. (This division of labour is reminiscent of H.G. Wells’ novel The Time Machine, except that in our world it is the Eloi who feed on the Morlocks.)
Having used priming exclusively to test hypotheses about associative memory, cognitive psychologists could not believe either that priming could have such a pervasive influence on behaviour or that people were not aware of this influence. They therefore searched for alternative ways to explain the findings.
It is important to note that these doubts concerned the unconscious influence of environmental primes on behaviour, which is only a relatively small subfield of “social priming”. The majority of priming research in social psychology focuses on the impact of primes on social judgements such as traits or stereotypical judgements, and this research has not come under critique. But the doubts resulted in numerous articles (read mostly by other cognitive psychologists) about criteria to decide when subliminal stimuli were really subliminal.
The discussion became more heated when methodologists published several articles about “questionable” research practices in psychology. These practices are “questionable” because they fall into a grey zone between proper and improper methodology, ranging from failing to report an outcome measure because it showed no effect to presenting a biased review of the literature that cites only supportive evidence. Such practices are suspected of increasing the likelihood that findings appear to support a hypothesis even when they do not (“false positives”).
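The inflationary effect of such flexibility is easy to demonstrate. The following is a minimal simulation sketch, not taken from any of the articles in question: it assumes a researcher who measures two outcome variables under the null hypothesis (no true effect) and reports the study if either outcome reaches significance, and shows how this pushes the false-positive rate above the nominal 5 per cent.

```python
# Monte Carlo sketch of one "questionable research practice": measuring
# two outcome variables and reporting whichever one "works".
# All numbers here are illustrative assumptions, not figures from the
# methodological articles discussed above.
import math
import random

random.seed(1)

N = 20          # participants per simulated study
TRIALS = 20000  # simulated studies, all under the null (no true effect)
Z_CRIT = 1.96   # two-sided 5% criterion for a z-test with known sigma = 1

def significant():
    """Does one null outcome measure look 'significant'?"""
    sample = [random.gauss(0, 1) for _ in range(N)]
    z = (sum(sample) / N) * math.sqrt(N)  # z-test, known sigma = 1
    return abs(z) > Z_CRIT

honest = sum(significant() for _ in range(TRIALS)) / TRIALS
# QRP: two outcomes, report the study if either one is significant
flexible = sum(significant() or significant() for _ in range(TRIALS)) / TRIALS

print(f"one outcome reported:  {honest:.3f}")    # near the nominal 0.05
print(f"best of two reported:  {flexible:.3f}")  # near 1 - 0.95**2 = 0.0975
```

With just one extra outcome measure, the chance of a spurious “effect” nearly doubles; each additional degree of flexibility compounds the problem further.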
When looking for evidence of such practices in social psychology, Uri Simonsohn, associate professor at the Wharton School, University of Pennsylvania, came across an article by Lawrence Sanna, at that time a University of Michigan social psychologist, that reported very strong effects from a weak manipulation. More suspiciously, the data did not seem to vary as widely as one would expect.
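The intuition behind that suspicion can be sketched in a few lines. This is an illustration of the general logic, not Simonsohn’s actual test: even when the true variances of several experimental conditions are identical, the sample standard deviations should themselves show some spread, so reported standard deviations that are nearly identical across conditions are statistically surprising. The cell sizes and thresholds below are assumptions for the sake of the example.

```python
# Simplified sketch of "the data do not vary as widely as one would
# expect": simulate the spread of sample SDs across honest conditions
# and ask how often a very small reported spread would arise by chance.
# Illustrative only - not Simonsohn's published procedure.
import random
import statistics

random.seed(2)

N_PER_CELL = 15   # assumed participants per condition
CONDITIONS = 4    # assumed number of experimental conditions
SIMS = 5000       # simulated honest data sets

def sd_spread():
    """Max - min of the sample SDs across conditions with equal true SDs."""
    sds = [statistics.stdev(random.gauss(0, 1) for _ in range(N_PER_CELL))
           for _ in range(CONDITIONS)]
    return max(sds) - min(sds)

spreads = [sd_spread() for _ in range(SIMS)]

# Suppose a paper reports four SDs that are almost identical (spread 0.01).
reported_spread = 0.01
p_value = sum(s <= reported_spread for s in spreads) / SIMS
print(f"P(spread this small by chance) ~ {p_value:.4f}")
```

When such a probability comes out vanishingly small across many studies by the same author, chance becomes an implausible explanation, which is the kind of signal that prompted further scrutiny.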
Simonsohn began to look for such signs in other social psychological research and identified several articles by Dirk Smeesters, a Belgian social psychologist working at Erasmus University in Rotterdam. Both Smeesters and Sanna resigned after the revelations: so far, four papers by the former and seven by the latter have been retracted.
Although Simonsohn had started his search before the Stapel affair, the fact that the two new fraud cases became known within weeks of it further tarnished the image of social psychology. And while those cases were still making headlines last year, another event occurred that brought the focus of the debate back to false positives: a paper reporting the failure of a group from the Free University of Brussels to replicate the elderly walking study, which was originally carried out at New York University by John Bargh (now professor of psychology at Yale University).
Since there have also been several successful replications of the study by Bargh and others, and since occasional failures to replicate work occur in all sciences and often suggest moderating variables, the article would probably have had little impact had Bargh not tried to trash it in a blog posting, calling the Brussels group incompetent and dismissing PLoS One, the respected open-access journal that published the research, as low quality.
We do not know whether this paper and Bargh’s intemperate response were the reasons why Alan Kraut, executive director of the Association for Psychological Science, called a small meeting on replication-related issues in September 2012, but they were certainly why Nobel prize-winning psychologist Daniel Kahneman attended it. It seems that the participants (first among them Hal Pashler, a cognitive psychologist from the University of California, San Diego) used the meeting to launch another attack on studies of the effects of “social priming”. Impressed by the critique, Kahneman went away and wrote a widely misinterpreted open letter to “students of social priming”.
“My reason for writing this letter is that I see a train wreck looming,” he wrote. “I expect the first victims to be young people on the job market. Being associated with a controversial and suspicious field will put them at a severe disadvantage in the competition for positions.”
The combination of powerful imagery with the status of a Nobel laureate ensured that the warning made international news - although journalists ignored the sections of Kahneman’s letter stating that he is not a social psychologist, that he has written a book that relies heavily on priming research, and that he still believes in priming effects. He recommended that to alleviate doubts, a board of senior researchers should nominate five such effects to be independently replicated by different laboratories.
In emails, Kahneman expressed surprise that what he considered a reasonable suggestion had caused uproar among cognitive social psychologists, who felt unfairly targeted. Replicating studies is obviously a good idea, but why should social priming be held to a higher standard? Furthermore, by referring to social priming (or sometimes even just priming), Kahneman overgeneralised a critique that had mainly been directed at behaviour-priming studies. Yet even with behaviour priming, the occasional non-replication is vastly outweighed by a substantial body of supporting evidence.
We have no reason to doubt Kahneman’s good intentions. But he can be criticised for not having recognised that he might have been overly influenced by a small group of psychologists who are hardly representative of the field. Furthermore, he should have foreseen the immeasurable damage his letter would do to social cognition research and to social psychology in general.
We also doubt that his letter will have the consequences he envisaged. It will certainly encourage a flood of attempted replications of behaviour-priming studies. But because the quality of this research will be variable, the results can be expected to be variable, too - providing grist for the mills of supporters and critics. In our opinion, the controversy will not be resolved by replicating effect studies, but rather by further theoretical elaboration of the psychological processes underlying these effects and by empirical testing of those processes.
The Stapel investigation committees justified their indictment of the whole of social psychology with the observation that Stapel’s fraud had not been discovered during peer review. Apart from replication, peer review is supposed to be the main mechanism by which science is assumed to be self-correcting, so this “reviewer blindness” is indeed shocking. Sadly, it is also characteristic of all scientific fields. To use it to attack social psychology in particular blatantly ignores the vast literature on scientific misconduct.
An analysis of a sample of 40 well-documented fraud cases, mostly in biomedicine, published in Perspectives on Psychological Science last year by three social psychologists (including one of the authors of this article), found that hardly any were discovered during peer review. And even though everybody wondered afterwards how the co-authors or the physicists and medical scientists who had reviewed these articles for top journals such as Nature and Science could have missed such glaring inconsistencies, nobody suggested that this indicated their disciplines condoned “sloppy science”.
There are several reasons for such reviewer blindness. Because fraud is relatively rare, its possibility is not generally contemplated. Science is based on trust, and scientists find it difficult even to consider that members of the club might be cheating. The major difference between the Dutch committee members and the reviewers of Stapel’s manuscripts is that the reviewers were assessing articles by a scientist of unblemished reputation whereas the committee members already knew that most of Stapel’s research was fraudulent. There is a rich social psychological literature on biases in human reasoning and decision-making, including both the “hindsight bias” - explaining why people are always cleverer after the fact - and the confirmatory bias in hypothesis testing, whereby researchers seek information that confirms their hypothesis and ignore data that contradict it.
It is also significant that, in most known fraud cases, the perpetrators took great care to predict effects that were highly plausible on the basis of previous research. Nor were the frauds typically discovered through failed replications. This can be attributed partly to the scarcity of published replications, given how difficult they are to place, but the major reason is that a single failure to replicate is no indication of misconduct: there are many other reasons why it can happen.
Less obviously, fraudulent studies can be replicated successfully. By staying close to established knowledge in their inventions, fraudsters have at least as much chance (if not more) as honest researchers of coming up with valid hypotheses. As Stapel writes in his autobiography, his invented findings were often replicated: “what seemed logical and was fantasised became true”.
Similar points have been made by the numerous professional organisations that condemned the investigating committees’ unjustified leap from disapproval of Stapel’s misconduct to condemnation of social psychology as a whole. The European Association of Social Psychology called it “unwarranted and unscientific”, while Stephen Gibson, honorary secretary of the British Psychological Society’s social psychology section, warned in Times Higher Education against tarring the whole discipline with the Stapel brush (‘Don’t tar discipline with Stapel brush’, 20 December 2012).
In a rejoinder published this month in The Psychologist magazine, the committee chairmen admit that if they had looked at the literature on fraud in general, they might have realised that these problems were not unique to social psychology. But they add that such a comparison was not part of their remit.
Of course, there are lessons to be learned. Indeed, the Stapel affair has already led to changes in the way social psychologists conduct their scientific business. Consensus has been reached that psychology journals will now require the raw data for all published studies to be made publicly accessible online. Although this will not eliminate fraud, it will make fraud riskier: in the past, fraudsters had to share only descriptive and test statistics, never the raw data themselves.
And although there has always been agreement that researchers should report how they have dealt with missing values and explain why they have eliminated any participants, the Stapel and Smeesters cases suggest that not everybody follows these rules. There are now safeguards being discussed that would prevent such violations.
Once the dust has settled, social psychology will emerge as an even stronger discipline as a result of these measures. It should not be defamed because of the immoral activities of an unscrupulous but tiny minority.