Pandemic response shines spotlight on coding in science

Debate about computer programs underlying epidemiological modelling leads to wider calls for more openness

June 16, 2020
[Image: Visitors pass a giant video screen with moving letters symbolising security codes at the CeBIT computer fair in Hanover. Source: Reuters]

The Covid-19 pandemic has brought many scientific issues to wide public attention, but even in these extraordinary times, the way computer coding is used in research is not a topic many would have predicted for mainstream discourse.

Nonetheless, the subject has burst into the open, mainly because of scrutiny of the code used in epidemiological modelling – in particular, the highly influential Imperial College London paper, led by Neil Ferguson, published just as the UK started going into lockdown.

The code underlying the modelling came in for criticism after it was posted to GitHub, the public code-hosting platform, although, according to a report in Nature this month, scientists who have tested the code have found that its results can be reproduced.

Bill Mitchell, director of policy at the British Computer Society (BCS), said that although the society agreed that there was “no credible evidence” of major problems with the Imperial code, the episode had shone a light on how programming was performed and reviewed in academia.

The BCS released a position paper last month in which it said “the quality of the software implementations of scientific models appear to rely too much on the individual coding practices of the scientists” and called for professional software engineering standards to be used where scientific code formed the basis of policy.

Dr Mitchell, a former computing lecturer at the universities of Manchester and Surrey, said there were “lots of very, very standard things that you would expect in the software world” that are not always being done in science.

This included code being readily shared on public repositories such as GitHub; being written in such a way that it can be easily understood and tested by others; and tests being published so reviewers can easily try to replicate the results.
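As a purely illustrative sketch of those practices, the fragment below shows a toy, deterministic SIR-style model step written so that others can read, run and test it, with its tests published alongside the code. Every name here (`sir_step` and its parameters) is hypothetical and has no connection to the Imperial model or any real epidemiological codebase.

```python
# Illustrative only: a minimal, deterministic SIR model step with the
# kind of published tests described above. Hypothetical code, not drawn
# from any real epidemiological model.

def sir_step(s, i, r, beta=0.3, gamma=0.1):
    """Advance susceptible/infected/recovered fractions by one time step."""
    new_infections = beta * s * i   # transmission term
    new_recoveries = gamma * i      # recovery term
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def test_population_is_conserved():
    # The three compartments should always sum to the whole population.
    s, i, r = sir_step(0.99, 0.01, 0.0)
    assert abs((s + i + r) - 1.0) < 1e-12


def test_results_are_reproducible():
    # Identical inputs must give identical outputs on every run,
    # so reviewers can replicate published results exactly.
    assert sir_step(0.9, 0.1, 0.0) == sir_step(0.9, 0.1, 0.0)


if __name__ == "__main__":
    test_population_is_conserved()
    test_results_are_reproducible()
    print("all tests pass")
```

Code written this way can be dropped into a public repository and exercised by any reviewer with a single command, which is the point Dr Mitchell makes: the tests are part of showing your working.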

“It goes to the heart of doing science. You tell people what experiments you’ve done; you allow them to look at your working,” he said.

Dr Mitchell said his “very personal” view was that scientists might sometimes view coding as just a “mechanical way of generating data” and might not fully appreciate “just how much innovation and ingenuity and cleverness is embedded in their own code and how valuable that is to other people”.

Changing this culture – especially given the “intense” publish or perish pressures in academia – might require incentives similar to those seen in the open access movement, he said.

The “simplest thing” would be to say that all scientific software developed with public money must be made openly available. “I think suddenly when people realise that, ‘Oh my gosh, people are going to be looking at my code’, the standard will instantly improve,” Dr Mitchell said.

Others say the direction of travel is towards more openness, but that there is a debate to be had about how to speed up progress.

“In my field, there has been a movement towards transparency for quite a number of years, and it is becoming more and more common for journals, reviewers and the community to require code to be made available with papers,” said Rosalind Eggo, assistant professor in infectious disease modelling at the London School of Hygiene and Tropical Medicine.

She added that one longer-term solution would be to invest more in employing research software engineers “who are experts in writing and translating scientific code and making it more efficient, shareable and, ultimately, more useful”.

“Making sure we have the resources that allow the hiring and long-term funding of software specialists would improve the quality of scientific code and hopefully make it easier to build efficient analysis, and to reuse and repurpose code,” she said.

Konrad Hinsen, a biophysicist at France’s National Centre for Scientific Research (CNRS) and an expert in scientific computing who often blogs on the issue, suggested that employing more research software engineers was a good idea.

However, he added, using them to help write code might be difficult for “small, exploratory projects that are done in informal collaborations”.

“You can’t just add a software expert with a very different working style to such a team. But you can still do after-the-fact code review before accepting results for publication,” he said.

This is where research software engineers could have a key role more generally, including through the traditional publishing process, he said, pointing out that some “pioneering journals” were already including code review as an “integral part” of the peer review process.

More broadly, Dr Hinsen added, the issue was one of “training enough people, and then employing them in appropriate jobs”. However, he was somewhat sceptical about whether progress could be sped up across all disciplines in science.

“Much scientific code is long-lived, and habits are even more subject to inertia. Faster improvement is not possible for scientific code in general, though it is in specific, well-defined subjects where motivation is high. Epidemiology might be in that situation right now,” he said.

simon.baker@timeshighereducation.com

POSTSCRIPT:

Print headline: Pandemic models spark calls to reveal more code



Reader's comments (3)

Somewhat surprised to see no mention of Software Carpentry https://software-carpentry.org in this article.
The article glosses over the fact that the attack on Prof Ferguson's coding style was highly suspicious in its motivations, as illustrated by a 'well-known' blog, which I will not dignify with a reference, whose conclusion was to 'defund all epidemiology research'. It seems that some commentators have just discovered that past 'scientific' software was mostly about coding numerical recipes, and that its algorithmic content may have been much less sophisticated than, say, a GNU Prolog compiler. Disclaimer: obviously my username gives away which style of programming I adhere to, although this bears no relevance to this comment.
Worth mentioning that these concerns are known and have given rise to, among others, the Software Sustainability Institute and the Society of Research Software Engineering.
