The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives

Steve Fuller on Deirdre McCloskey's ongoing struggle against economists' fetishisation of data

April 3, 2008

Nearly a quarter of a century ago, Deirdre (then called Donald) McCloskey published the breakthrough book The Rhetoric of Economics, which began her uphill struggle to reclaim the mantle of "moral science" for economics.

Blessed with a humanist sensibility and accessible prose style, yet trained in the rigours of Chicago School economics, she showed in meticulous detail how even hard-boiled positivists such as Milton Friedman and Gary Becker used rhetoric and made ethical assumptions to make up for a lack of strict deductive validity in their arguments.

Over the years, McCloskey has sharpened her attacks on fellow economists, though interestingly her libertarian politics remain not so very different from theirs. Rather, she aims at her colleagues' misplaced rigour, which makes economics at best irrelevant and at worst inhumane.

In 1980, McCloskey left Chicago for the University of Iowa, where she helped to found the Project on the Rhetoric of Inquiry. This book sees her teaming up with one of her students from that period, Steve Ziliak. They mount a devastatingly comprehensive critique of the mindless reliance on statistical significance testing in economics, medicine, experimental psychology and quantitative social policy research.

The thesis of this book is easy to state and illustrate: statistical significance is routinely confused with policy significance.

Suppose there are two diet pills, and you want to lose 5lb. One pill promises that you'll lose 20lb, give or take 14lb, while the other promises that you'll lose 5lb, give or take 0.5lb. The first pill's outcomes are more erratic, but it is also the more effective. You would take the second pill only if you wanted your weight loss to deviate as little as possible from exactly 5lb.

Perhaps some people think about weight loss this way. If so, they happily conflate statistical and policy significance. However, that is probably not how most researchers, or their clients, think about the matter. They are seeking effectiveness - what the authors call "oomph" - but they are settling instead for mere precision under the guise of "statistical significance".
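The contrast between precision and "oomph" in the diet-pill example can be put into numbers. The sketch below is only illustrative and rests on an assumption the example itself does not state: that "give or take" means one standard deviation of a normally distributed outcome. The names `pills`, `TARGET_LB` and `summarize` are invented for the illustration.

```python
from statistics import NormalDist

# Assumption (not in the review): "give or take" is one standard deviation
# of a normally distributed weight loss, measured in pounds.
pills = {"pill_1": (20.0, 14.0), "pill_2": (5.0, 0.5)}
TARGET_LB = 5.0  # the dieter's goal from the example


def summarize(mean_loss, spread, target=TARGET_LB):
    """Compare a significance-style measure with a goal-based one."""
    # Ratio of effect size to spread: the quantity a significance test rewards.
    ratio = mean_loss / spread
    # Two-sided p-value against a null hypothesis of zero weight loss.
    p_value = 2 * (1 - NormalDist().cdf(abs(ratio)))
    # Probability of losing at least the target amount.
    p_reach_target = 1 - NormalDist(mean_loss, spread).cdf(target)
    return {
        "oomph_lb": mean_loss,
        "ratio": ratio,
        "p_value": p_value,
        "p_reach_target": p_reach_target,
    }


results = {name: summarize(*params) for name, params in pills.items()}
for name, r in results.items():
    print(name, r)
```

Under these assumptions the first pill fails a conventional 5 per cent significance test while the second passes it comfortably, yet the first pill is the more likely of the two to deliver the 5lb the dieter actually wants.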

The book recounts the alarmingly diverse instances of researchers and clients misguidedly settling for statistical significance. The book's villain is Ronald Fisher, the most influential statistician of the 20th century. He introduced statistical significance as the gold standard of hypothesis testing.

Fisher was interested in the likelihood of an experimental outcome given a particular hypothesis. He tested a hypothesis by asking how probable the observed outcome would have been if there were no real effect at all (that is, under the "null hypothesis"). This strategy conformed to Fisher's idea of science as pure inquiry, in which matters of utility and cost do not figure.

To be sure, there were dissenters, mostly Bayesians, to Fisher's rarefied yet influential view of science. Most heroic of them all was William Sealy Gosset, a student of Karl Pearson who pioneered the treatment of hypothesis-testing as a species of decision-making.

As director of the brewing labs at Guinness in the early 20th century, Gosset introduced the "loss function", which, given the available evidence, calculates the cost of accepting a hypothesis vis-a-vis potentially better hypotheses that might require further testing.
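A rough sketch may help to show what treating a choice as a decision with a loss function involves. It is not Gosset's own formulation. It reuses the two diet pills from the earlier example, assumes normally distributed outcomes with "give or take" read as one standard deviation, and adopts a deliberately crude, hypothetical loss (`miss_target_loss`) that charges one unit whenever the 5lb goal is missed.

```python
from statistics import NormalDist

TARGET_LB = 5.0  # the dieter's goal from the diet-pill example


def expected_loss(outcome, loss, steps=20_000, width=8.0):
    """Approximate E[loss(X)] for X ~ outcome by midpoint integration."""
    lo = outcome.mean - width * outcome.stdev
    h = 2 * width * outcome.stdev / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h  # midpoint of the i-th slice
        total += loss(x) * outcome.pdf(x) * h
    return total


def miss_target_loss(pounds_lost):
    # Hypothetical loss: 1 if the goal is missed, 0 otherwise.
    return 1.0 if pounds_lost < TARGET_LB else 0.0


# Assumed outcome distributions (mean, standard deviation) in pounds.
candidates = {
    "pill_1": NormalDist(20.0, 14.0),
    "pill_2": NormalDist(5.0, 0.5),
}

losses = {name: expected_loss(dist, miss_target_loss)
          for name, dist in candidates.items()}
best = min(losses, key=losses.get)  # option with the smallest expected loss
print(losses, best)
```

With this particular loss the erratic first pill comes out ahead, because it misses the goal less often. A different loss function could rank the pills differently, which is why the costs at stake have to be stated explicitly rather than left to a significance threshold.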

For Ziliak and McCloskey, Gosset provided a rigorous take on what "significance" means in policy-relevant research. While the book documents the difficulties the authors have faced in persuading their fellow economists of Gosset's virtues, it contains many pedagogical lessons worthy of any social research methods course.