Two academics have developed software to detect plagiarised work.
Two forensic linguistics experts have pooled their expertise and come up with a detection system that, it is claimed, will expose even the most sophisticated copyist. Computer consultant David Woolls, of CFL Software Development in Birmingham, said essays downloaded from the web were notoriously difficult to spot unless more than one student used the same text.
Lecturers often had their suspicions aroused by an unusual style of writing, he said, but proving any misdemeanour was another matter. "Even double marking isn't going to solve this problem," Mr Woolls said.
Mr Woolls met Malcolm Coulthard, professor of English language and linguistics at Birmingham University, when the pair were working as specialists for lawyers. They have developed a system that compares texts and detects unusual similarities using five different levels of analysis.
The software distinguishes between function words, such as "and" and "but", and lexical words, which are the key to detection. Even when a plagiarist has attempted to disguise copied work, the lexical words of the source will tend to recur in the copy. Normally, fewer than half the lexical words would be the same in any two texts on the same subject by different authors. "If 60 per cent of lexical words are the same, then suspicions are aroused," Mr Woolls said. "If 80 per cent are the same, we can be certain copying has taken place."
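The idea of comparing lexical words can be illustrated with a rough sketch. This is not the researchers' software. The short function-word list, the way the percentage is computed (the share of the smaller text's distinct lexical words that also appear in the other text) and all the names below are assumptions made for illustration. Only the 60 and 80 per cent thresholds come from Mr Woolls.

```python
import re

# Assumption: a tiny illustrative list; a real function-word list would be much longer.
FUNCTION_WORDS = {
    "a", "an", "and", "but", "the", "of", "in", "on", "to", "is",
    "was", "it", "that", "or", "for", "with", "as", "by", "at",
}

def lexical_words(text):
    """Return the set of distinct words in text that are not function words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {t for t in tokens if t not in FUNCTION_WORDS}

def lexical_overlap(text_a, text_b):
    """Share of the smaller lexical vocabulary that also occurs in the other text.

    Assumption: the article does not say how the percentage is measured;
    this is one plausible definition.
    """
    a, b = lexical_words(text_a), lexical_words(text_b)
    smaller = min(len(a), len(b))
    if smaller == 0:
        return 0.0
    return len(a & b) / smaller

def verdict(overlap):
    """Map an overlap ratio onto the thresholds quoted in the article."""
    if overlap >= 0.8:
        return "copying indicated"
    if overlap >= 0.6:
        return "suspicious"
    return "unremarkable"

if __name__ == "__main__":
    score = lexical_overlap("The cat sat on the mat", "A cat sat on a mat")
    print(f"{score:.0%} shared lexical words: {verdict(score)}")
```

Counting distinct words rather than every occurrence keeps one heavily repeated term from dominating the score, though a real system might well weigh things differently.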
The software can also compare texts in other ways, focusing, for instance, on spelling or punctuation, or patterns of writing that may differ from the author's usual style. But how accurate is it?
In a recent blind test, Mr Woolls said, 32 essays were written by eight people on 12 different topics. Four essays were copies, and on at least one of the tests the system was 100 per cent accurate. "This is only ever going to be an opinion, but it can be very accurate and may at the very least confirm a hunch," Mr Woolls said.