Leader: A case of considered judgment

September 18, 2008

Here's a shock: academics know how to assess work and reach mostly the same conclusions about what mark is merited

When we decided to conduct our Big Marking Experiment, we set out with only one objective in mind: to find out if there really was huge variation between markers and between universities - as those calling for an overhaul of the assessment system would have us believe - or whether academics were just going about the job in the rigorous and fair way they have always done.

Marking rests on a professional judgment that is informed by standards laid down within a discipline - no more and no less. That is why it is difficult to reduce to a set of rules that can be applied consistently. There have been various claims that marking in universities is "inherently frail" (Sue Bloxham in The Myth of Marking) - with variations ranging from fail to borderline first - and the Quality Assurance Agency has weighed in to state that institutions have only "weak control" over marking.

At a time when students invest thousands of pounds in getting a degree, the stakes are high, and there will inevitably be an increased focus on assessment. Just check out the essay banks on the internet and see how they market their wares as an investment in the future. And websites such as MarkMyEssay.com offer a two-tier service (up to £480 for an 8,000-word masters dissertation) to check an essay before it even reaches an academic.

Added to that, of course, is the real possibility that students could seek legal redress for what they see as an unfair mark, as has happened in the case of A levels.

We do not pretend that our experiment using a first-year philosophy essay is scientific; the results can give only a snapshot. And in the real world, of course, the essay would be subject to moderation and/or second marking.

Nevertheless, the outcome was surprising: the marks ranged between a fail and a good upper second, with most of them coalescing around a 2:2/2:1.

The fail came from a marker at a new university who spotted that the student writer had plagiarised a text. But by the marker's own admission, the case was a borderline one and it was considered to be one of extensive paraphrasing (the source text, incidentally, was recognised by only one of our markers) with insufficient referencing. In the academic's opinion, the essay fell into the category of "plagiarism and/or poor study skills", and the student would have been asked to resubmit. In sniffing out the plagiarism, the canny academic outdid not only the fellow markers but also Turnitin, the well-known plagiarism-detection software.

Overall, however, the comments from the markers reveal a general consensus about the strengths and weaknesses of the essay. Yes, there will always be differences of opinion, and marks will reflect that. Many academics freely admit that they often just get an instinctive sense of what mark a piece of work deserves. But is that inherently wrong? Or is that just what you would expect from a marker with years of experience?

What may be surprising to some is the fact that the lowest marks, even when the zero for plagiarism is excluded, came from academics in post-1992 institutions. So much for the claim that the new universities assess less rigorously than do the old.

The results of our experiment are not as wide-ranging as the marking system doomsayers would have predicted, and journalistically they're not headline-grabbing, either (little variation in marking among academics shock). But for universities the outcome is heartening and perhaps an indication that we should trust professional judgment and treat with caution calls for the whole system of assessment to be scrapped.

ann.mroz@tsleducation.com
