WUR 3.0: how we could more fairly assess teaching

Duncan Ross explores how the next iteration of the World University Rankings methodology may judge teaching

February 21, 2020

When the Times Higher Education World University Rankings was first conceived, one of the objectives was to make the assessment much more comprehensive than existing global league tables, which had (and still have) a tendency to include only research metrics.

There is a straightforward reason for this – it is relatively easy to compare research performance across national boundaries. Indeed, research often involves collaboration between scholars in different countries and many publications are read by academics across the globe.

However, we wanted to at least nod in the direction of other areas of university activity to provide a more thorough overview of institutional performance. So, alongside our research environment and citations pillars, we included three others: industry income, international outlook, and teaching environment.

I will explore some potential changes to industry income and international outlook in future blogs, but in this article I’m going to focus on teaching.

Our five teaching environment metrics make up 30 per cent of the overall score in the ranking:

  • Teaching reputation
  • Academic staff-to-student ratio
  • Doctorate-to-bachelor’s degree ratio
  • Doctorates-awarded-to-academic-staff ratio
  • Institutional income per academic staff member

We look at full-time equivalent numbers of students and staff, rather than the simpler measure of headcount.

We are currently exploring changes to the doctoral measures and to the income metric. The two doctoral measures are there to support the Humboldtian idea that teaching is improved by access to research, and to show that we expect our research universities to balance the production of the next generation of researchers with undergraduate teaching.

But these measures also have weaknesses. We give the highest scores to universities that award the most PhDs relative to their academic staff, which in effect rewards efficient use of staff, but is this fair? What would be the perfect proportion of doctoral degrees? Meanwhile, the doctorate-to-bachelor’s indicator says more about the shape of the university than about its quality.

An alternative we are considering is replacing these two metrics with a measure of doctoral progress, potentially framed as the share of full-time equivalent doctoral degrees completed within a specific time period. In other words: what proportion of a cohort is able to successfully complete a PhD?

Unlike undergraduate degrees, which are frequently time-limited, doctorates are – at least theoretically – open-ended. While this is frequently not the case in reality, largely due to funding, it does raise the question of what is a reasonable length of time to finish a doctorate.

In the US, as fans of the comedy film Animal House will know, an undergraduate degree can be extended for many years. As a result, when assessing undergraduate progress it is relatively rare to use the percentage of degrees awarded at 100 per cent of the expected time (for example, four years for four-year degrees). Instead, it is far more common to use the 150 per cent time period.

It would seem reasonable to take this approach when measuring doctoral progress too. We might hope for a PhD to be completed within four years, but would be confident that for most students it will be achieved in six. There is an additional question of whether the completion time should vary by subject.
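To make the 150 per cent idea concrete, here is a minimal Python sketch of how such a completion measure might be calculated. It is an illustration only, not THE's methodology: the `DoctoralStudent` record, the `completion_rate` function, the four-year expected duration and the sample cohort are all assumptions made for the example, and it counts students by headcount rather than full-time equivalents to keep things short.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DoctoralStudent:
    # Hypothetical record: years taken to complete, or None if not completed.
    years_to_complete: Optional[float]


def completion_rate(
    cohort: Sequence[DoctoralStudent],
    expected_years: float = 4.0,  # assumed expected duration of a PhD
    window: float = 1.5,          # 1.5 means "within 150 per cent of expected time"
) -> float:
    """Share of the starting cohort that completed within the time window."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    limit = expected_years * window
    completed = sum(
        1
        for student in cohort
        if student.years_to_complete is not None
        and student.years_to_complete <= limit
    )
    return completed / len(cohort)


# Made-up cohort of five students; one never completes.
cohort = [
    DoctoralStudent(3.5),
    DoctoralStudent(5.0),
    DoctoralStudent(6.0),
    DoctoralStudent(7.5),
    DoctoralStudent(None),
]

rate_100 = completion_rate(cohort, window=1.0)  # completed within four years
rate_150 = completion_rate(cohort)              # completed within six years
```

In this invented cohort only one student in five finishes within the expected four years, while three in five finish within six, which shows how sensitive the measure is to the window chosen. A subject-specific `expected_years` would be one way to handle the question of whether completion time should vary by discipline.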

Another change we are exploring is whether to replace the institutional income metric with a more precise measure.

In our teaching-focused Wall Street Journal/THE College Rankings, which ranks only colleges and universities in the US, we include an indicator on expenditure on teaching and teaching-related activities instead of one on total institutional income. It is easy to do this because these data are recorded within the US government’s Integrated Postsecondary Education Data System.

If we were to introduce this metric in the World University Rankings it would be normalised by subject; we wouldn’t want to penalise arts-focused universities or courses, which tend to be less expensive to run. But could universities provide the data?
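One way subject normalisation could work is sketched below in Python. This is an assumption-laden illustration rather than a description of how the rankings are calculated: it divides each university's teaching spend per student in a subject by the average for that subject across all universities, then weights the results by the university's share of students in each subject. The university names, figures and function names are invented for the example.

```python
from statistics import mean

# Hypothetical teaching spend per full-time equivalent student, by subject.
spend = {
    "Arts U": {"arts": 8000.0},
    "Tech U": {"engineering": 16000.0},
    "Mixed U": {"arts": 12000.0, "engineering": 24000.0},
}

# Hypothetical full-time equivalent student numbers, by subject.
students = {
    "Arts U": {"arts": 1000.0},
    "Tech U": {"engineering": 1000.0},
    "Mixed U": {"arts": 500.0, "engineering": 500.0},
}


def subject_means(spend_by_university):
    """Average spend per student for each subject across all universities."""
    by_subject = {}
    for subjects in spend_by_university.values():
        for subject, value in subjects.items():
            by_subject.setdefault(subject, []).append(value)
    return {subject: mean(values) for subject, values in by_subject.items()}


def normalised_score(university, spend_by_university, students_by_university, means):
    """Student-weighted ratio of a university's spend to each subject average."""
    total_students = sum(students_by_university[university].values())
    return sum(
        (students_by_university[university][subject] / total_students)
        * (value / means[subject])
        for subject, value in spend_by_university[university].items()
    )


means = subject_means(spend)
scores = {name: normalised_score(name, spend, students, means) for name in spend}
```

With these made-up figures, the arts-only and engineering-only institutions receive the same normalised score even though the engineering one spends twice as much per student in absolute terms, because each is compared only with the average for its own subject.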

Even in the US there are potential difficulties related to whether teaching expenditure at public university systems is recorded as a total for the entire system or broken down by individual campuses.

There is also a question of fairness: would such a metric further reward wealthy universities or, even worse, encourage deliberately inefficient spending?

These are all issues we will be looking to address over the coming months.

Do you have ideas about how we can improve our rankings? Send suggestions and questions to us at profilerankings@timeshighereducation.com.

Duncan Ross is chief data officer at Times Higher Education.


Reader's comments (1)

Many of these indices are not evidence-based. See Hattie's meta-analysis on the factors influencing learning: https://www.evidencebasedteaching.org.au/hattie-2017/ For example, staff-to-student ratio was listed as one indicator. In Hattie's meta-analysis, the effect size for teaching effectiveness for this factor was d=.15. This is small compared with students' self-reported grades, for which d=1.33. This is what we have known empirically (but have been in denial about all along): if we admit people with good academic potential, they are likely to learn better. Like a 'Duh' moment here. I remember Hattie's comment was that decreasing disruptive behaviour among students has a bigger influence on learning outcomes (d = .34) than having a smaller class. So a university that enrols more students with behavioural problems needs to spend more effort addressing this, or else it leads to poorer teaching outcomes. Teacher collective self-efficacy has an effect size of 1.39. Why don't surveys include this instead? Everyone asks the students but no one asks the instructors... this appears odd to me when the evidence is there.