Excess mortality: England is the European outlier in the Covid-19 pandemic
Measuring excess mortality: England is the European outlier in the Covid-19 pandemic
Excess mortality is a count of deaths from all causes relative to what would normally have been expected. In a pandemic, deaths rise sharply, but causes are often inaccurately recorded. The death count1 attributed to Covid-19 may thus be a significant underestimate. Excess mortality data overcome two problems in reporting Covid-19-related deaths. First, they avoid miscounting from misdiagnosis or under-reporting of Covid-19-related deaths. Second, they include ‘collateral damage’ from other health conditions left untreated if the health system is overwhelmed by Covid-19 cases.
Why is it important to examine excess mortality data?
Excess mortality data can be used to draw lessons from cross- and within-country differences and help analyse the social and economic consequences of the pandemic and of relaxing lockdown restrictions. For country comparisons (where under-recording may differ), policymakers should examine robust measures expressed relative to the benchmarks of ‘normal’ deaths. Normal death rates reflect persistent factors such as the age composition of the population, the incidence of smoking and air pollution, the prevalence of obesity, poverty and inequality, and the normal quality of health service delivery.
Estimating R0 is crucial for deciding how fast and in what manner lockdowns can be relaxed.2 Excess death figures also help to avoid the measurement biases inherent in other data used to estimate R0 in epidemiological models.3
How is excess mortality measured and who measures it?
National statistical agencies publish actual weekly deaths and averages of past ‘normal’ deaths. The Office for National Statistics (ONS) reports ‘normal’ deaths for England and for Wales as the average of the previous five years’ deaths. There are no published benchmarks for more granular or disaggregated data, such as sub-regions or cities. Using the weekly historical data, researchers could calculate such benchmarks with some effort.
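As a rough illustration of how such a benchmark could be built from weekly historical data, the sketch below averages deaths in the same calendar week over the previous five years, mirroring the ONS convention described above. The function name `five_year_baseline`, the data layout and all counts are hypothetical and chosen only for illustration; a real benchmark would also need to handle registration delays and week-numbering differences.

```python
from statistics import mean


def five_year_baseline(weekly_deaths, year, week, n_years=5):
    """Average deaths in the same week over the previous n_years.

    weekly_deaths maps (year, week) -> number of deaths.
    """
    past = [weekly_deaths[(y, week)] for y in range(year - n_years, year)]
    return mean(past)


# Hypothetical week-15 counts for one sub-region (not real data).
weekly_deaths = {
    (2015, 15): 410, (2016, 15): 395, (2017, 15): 402,
    (2018, 15): 420, (2019, 15): 398, (2020, 15): 861,
}

baseline = five_year_baseline(weekly_deaths, 2020, 15)
excess = weekly_deaths[(2020, 15)] - baseline
print(f"baseline={baseline}, excess={excess}")
```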
Cross-European comparisons require collating data from individual national agencies, unless the Z-scores compiled by EuroMOMO4 for 24 states are used (see Appendix 1). EuroMOMO’s timely measures of weekly excess mortality in Europe allow comparisons of mortality patterns between different time periods and countries, and by age group. The Z-scores standardise data on excess deaths by dividing by the standard deviation of deaths. EuroMOMO are currently not permitted to publish actual excess death figures by country and do not publish the standard deviations used in their calculations. However, they graph the Z-scores and the estimated confidence intervals back to 2015, providing a useful visual guide to their variability. In the US, the National Center for Health Statistics (NCHS) publishes data on excess deaths and P-scores (see Appendix 1), defining excess deaths as deviations from ‘normal’ deaths plus a margin adjusting for the uncertainty of the data.5 These data cover counties and states, and are disaggregated by gender, age and ethnicity. The NCHS thus sets an international standard for statistical agencies.
At least three separate journalistic endeavours have recently engaged in the time-consuming effort of collecting and presenting more transparent excess mortality data (see Table 1). The Financial Times plots numbers of excess deaths, and the P-score or percentage of deaths that are above normal deaths; the Economist shows figures and graphics for excess deaths but not P-scores. However, the published estimates of P-scores in newspapers give only a recent snapshot, missing the context of historical variability provided by EuroMOMO. And we only have P-scores for some countries, regions and cities.
Z-scores are less easily interpretable than P-scores. Moreover, if the natural variability of the weekly data is lower in one country than in another, the Z-score could exaggerate that country’s excess mortality relative to the P-score. While this could be a problem, the selective comparisons of Z-scores and P-scores below reveal that England had the highest excess mortality on both measures among the European states covered. Therefore, comparisons of Z-scores remain highly relevant.
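The sensitivity of the Z-score to a country’s natural variability can be made concrete with a small sketch. The two countries, their baseline, the observed deaths and the standard deviations below are entirely hypothetical; the functions simply apply the definitions given in Appendix 1.

```python
def p_score(actual, expected):
    """Excess deaths as a fraction of expected deaths."""
    return (actual - expected) / expected


def z_score(actual, expected, sd):
    """Excess deaths in units of the standard deviation of weekly deaths."""
    return (actual - expected) / sd


# Hypothetical: identical baseline and identical excess deaths in both countries.
expected, actual = 1000.0, 1500.0
sd_low_variability = 25.0   # country with stable weekly deaths
sd_high_variability = 50.0  # country with noisier weekly deaths

p = p_score(actual, expected)
z_low = z_score(actual, expected, sd_low_variability)
z_high = z_score(actual, expected, sd_high_variability)
print(f"P-score={p:.2f}, Z (low variability)={z_low:.1f}, Z (high variability)={z_high:.1f}")
```

Both countries have the same P-score, yet the country with the more stable weekly data shows a Z-score twice as large.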
Table 1 Three journalistic endeavours to examine comparative excess mortality data for Europe, the UK and the US, and other countries (for ‘all age groups’ only)
Table 2 EuroMOMO Z-scores for poor performers showing peak weeks of excess mortality by age group
Source: Z-scores extracted from the EuroMOMO graphs, 16-May-2020. Notes: (i) The peak weeks for different countries are in bold. (ii) The country ordering is by peak mortality. (iii) The ONS defines a week as ending on Friday; Public Health England (PHE) and EuroMOMO define a week as ending on Sunday.
A first look at data for England: total figures and the over 65s
England eclipses all 24 countries covered by EuroMOMO in excess mortality scores. At the peak, England’s total Z-score based on actual deaths was 42.8 (week 15 in Table 2). The ONS records 21,182 registered deaths for the comparable week, compared with a normal number of 9,787 (the average of the previous five years).6 This gives excess registered deaths of 11,395 and a P-score of 1.164, that is, deaths 116.4% above normal. For the same week, the ONS registered 8,335 deaths as Covid-19-related, accounting for 73% of excess deaths. Data on actual deaths, reported by The Economist, give a peak P-score of 1.134.
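The arithmetic behind these figures can be reproduced directly from the three ONS numbers quoted above. The variable names are illustrative only.

```python
# ONS figures for the peak week quoted in the text.
registered_deaths = 21_182   # all-cause registered deaths
normal_deaths = 9_787        # average of the previous five years
covid_deaths = 8_335         # deaths registered as Covid-19-related

excess_deaths = registered_deaths - normal_deaths
p_score = excess_deaths / normal_deaths
covid_share = covid_deaths / excess_deaths

print(f"excess deaths = {excess_deaths}")
print(f"P-score = {p_score:.3f}")
print(f"Covid-19 share of excess deaths = {covid_share:.0%}")
```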
England’s peak rate of excess deaths for the most vulnerable age group, the over-65s, is also the highest (see Table 2, and Figure 1 for the next worst affected European countries). Figure 2 shows that England’s excess mortality scores are far higher than those of the rest of the UK; the scores of Scotland, Wales and Northern Ireland are lower than those of Spain, Belgium, Italy, the Netherlands and France. Italy initially dominated the headlines for Covid-19-related deaths but ranked fourth for peak excess mortality for the over-65s, below Spain and Belgium. In contrast, Germany showed excess mortality well within the normal range of -2 to +2 throughout the nine weeks in Figure 1.7
As a spot-check, P-scores were calculated from the actual and normal deaths reported by The Economist. Peak P-scores and Z-scores are compared in Table 3. Within Europe, the rankings almost coincide.8
Table 3 Peak week of excess mortality: Country P-scores and Z-scores compared
Sources: EuroMOMO Z-scores for the peak week of excess mortality, see Table 2. The peak week for the ‘all age groups’ category is the same for the UK countries. The P-scores for the peak week of excess mortality are calculated by the authors from country data newly released by The Economist on GitHub. The peak week timing for P-scores and Z-scores coincides for all countries.
Figure 1 Recent weeks of Z-scores for poor performers showing peak weeks of excess mortality, by age group
Source: Z-scores extracted from the EuroMOMO graphs, 16-May-2020. Their week 19 data are omitted, as the most recent week’s data tend to be heavily revised. Country ordering is by peak mortality for all age groups.
Figure 2 Recent weeks of Z-scores for the UK: England, Scotland, Wales and Northern Ireland
Source: Z-scores extracted from the EuroMOMO graphs, 16-May-2020. Their week 19 data are omitted, as the most recent week’s data tend to be heavily revised.
Excess mortality for the 15-64 and 65-74 age groups
Most disturbing is the comparative story for the 15-64 age group, which includes the mass of the working age population. Here England’s excess mortality in the Covid-19 era is strikingly higher than in the other European countries. At its peak in week 15, England’s Z-score for this group is 2.8 times the weekly peak of the next worst country, Spain, around 4 times those of France and Belgium, and more than 5 times that of Italy (see Table 2). Within the UK, excess deaths for this age group are also strikingly worse for England than for the other nations (see Figure 2). Puzzling, too, is that England’s Z-scores for the 65-74 age group are similar to those for the 15-64 age group (see Table 2). By contrast, in the five European countries, excess deaths in the 65-74 age group are about twice as high as for the 15-64 age group, though still below the 65+ age group.
England is the only country in Europe for which Z-scores for the 15-64 group had not fallen below about 2 by week 18, ending 3 May.
What can be learned from the within-UK comparisons of the Z-scores?
Interpreting large differences in excess mortality between nations requires considering three main factors, and the within-nation deviations in these factors: the average infection rates in preceding weeks, the average mortality risk from Covid-19, and constraints on Covid-19-specific health capacity.9 London’s international connectedness, and the timing and London-centric location of the spread of the infection, help explain England’s worse performance than Wales, Scotland and Northern Ireland. Regional data show a clear rise in London’s excess deaths ahead of other regions, and a peak P-score of 2.37, far above England’s 1.13. London’s population density and public transport system make social distancing harder to achieve. The undeniably late and initially unclear application of social distancing and the delayed lockdown measures (Conn et al. 2020, Horton 2020) then had a far worse outcome. More generally, there was a collective failure in preparedness across the public health system, especially in testing capability and in the supply and distribution logistics of personal protective equipment (PPE) for health workers (Foster and Neville 2020). The late recognition of the need to provide care-homes with PPE and tests has received recent attention. Regions outside London mostly fared better, though the West Midlands and the North West, the next largest conurbations, eventually had the next highest excess death rates (data from the FT and The Economist). This underlines the roles of timing and urban density. Infection rates outside London may have been lower, so that social distancing and lockdown measures were more effective when they were introduced.
Data from the ONS on age-corrected mortality rates by location show much higher Covid-19-related death rates in places with the greatest economic deprivation.10 Underlying health is likely worse in these areas and low-paid key workers, more exposed to potential infection, may live there in disproportionate numbers. This is particularly pertinent to the 15-64 age group. High comparative levels of excess mortality in England may also have been affected by ethnic differences in the incidence of Covid-19-related deaths (Khunti et al. 2020, Barr et al. 2020).
Further clues stem from ONS data on death rates by occupational groups. It is possible to examine the deviation from the average Covid death rate per 100,000 by occupation. Figure 3 shows death rate ratios to the average rate for the worst affected, mostly low-wage occupations, both for Covid-19 deaths and deaths from all causes, during the pandemic period to 20 April. The two ratios are strongly correlated and are a good indirect indicator of the corresponding rates of excess mortality.
For male transport workers, above-average mortality rates are pronounced for taxi drivers followed by bus drivers. Security guards and male care workers fare even worse. Among women, care workers have around 2.4 times the average death rates for working age women. For nurses, the ratio is 1.3, suggesting that staff in care-homes were particularly badly protected.
Figure 3 Deaths involving COVID-19 and all causes among (selected) occupation groups by gender (aged 20 to 64 years), England and Wales
Source: ONS release, Coronavirus (COVID-19) related deaths by occupation, England and Wales, 11-May-2020. Notes: (i) Age-standardised rates of death per 100,000 population. (ii) Figures for the most recent death registrations, deaths involving COVID-19 registered between 9th March and 20th April.
What can be learned from the cross-country comparisons of the Z-scores?
There is heterogeneity between nations and, as seen above, within nations. Ideally, drawing valid lessons for health policy and for easing lockdown requires like-for-like comparisons that control for differences in population density, average age, prevalence of diabetes and smoking, and wealth, among other factors. This reinforces the need for excess mortality data disaggregated to sub-regional levels across Europe and the UK.
Nevertheless, one can draw lessons from the aggregate EuroMOMO Z-scores. Excess mortality rates rose first in Italy, then in Spain, closely followed by England, Belgium, France and the Netherlands. Germany never had weekly Z-scores outside the normal range of -2 to +2.11 The interaction of multiple factors is likely to account for Germany’s better comparative record and helps explain outcomes in the high excess mortality countries. Ramping up testing capacity from January enabled effective tracing and isolation and kept down the rate of infection. Germany largely managed to keep the virus out of care-homes, with their high mortality risk, as evidenced by the lower median age of those infected. Higher capacity in health service delivery, measured by higher numbers of hospital beds, ICUs and ventilators per head of population, combined with the lower rate of infection, meant that Germany’s health system could deliver better care, reducing mortality. The health benefits of flattening the pandemic curve, with later economic benefits, were emphasised by Anderson et al. (2020) and Gourinchas (2020). Our longer paper examines the UK record, including the late application of social distancing and insufficient attention to early warnings from critics and scientific experts (Aron and Muellbauer 2020).
EuroMOMO and national statistical agencies should publish improved measures of excess mortality
Forecasting P-scores from epidemiological models for different scenarios on ending lockdown measures should be an important aid to formulating policy.12 Granular data by location within and between countries must be produced and made accessible for research and forecasting. An example using granular Italian death registry data is Ciminelli and Garcia-Mandicó (2020).13 Belloc et al. (2020) caution against drawing simplistic conclusions from cross-country correlations; they too stress the need for granular, comparable data.
National statistical offices should publish weekly P-scores of excess mortality for the constituent countries, regions and broad social groupings such as care-home residents, to help understand the pandemic and inform policy.14 P-scores are more salient and interpretable than Z-scores. They also avoid the possibility that Z-scores may under-represent excess mortality where death data are noisier.
We argue that EuroMOMO should be permitted to produce P-scores as well as Z-scores to aid comparability across countries. EuroMOMO’s five-year graphs of Z-scores visualise the natural weekly variability, helping to interpret the confidence intervals. Similar practice should be followed for published P-scores, including at national statistical agencies.
To end on a cautionary note, excess mortality should also be examined in a longer-term perspective. Spiegelhalter (2020) argues that the main impact of Covid-19 may be to bring forward the date of death by a few months for those close to death because of underlying poor health. If so, a peak in weekly deaths should be followed by a trough; see Table 2 for hints that this may be occurring in some countries. However, total years of life lost is a better measure of the pandemic’s social toll. Even in the extreme case envisaged by Spiegelhalter, if the 12-month moving average of excess mortality showed no deviation outside the normal range of -2 to +2, total years of life lost could still show an upturn.
If national statistical agencies regularly published monthly, 3-month, 6-month and 12-month moving averages, and weekly P-scores, this would greatly assist our ability to interpret the pandemic data.15 Provision of timely, regularly updated and comparable granular data on excess mortality by national and international statistical agencies should be high on the agenda. It is not enough to leave this to hard-working journalists.
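One simple way to construct such longer-window measures is to pool deaths over a rolling window before computing the P-score. The sketch below does this for a four-week (roughly monthly) window. The function name `rolling_p_score` and the weekly counts are hypothetical, and agencies might well prefer a different smoothing method.

```python
def rolling_p_score(actual, expected, window):
    """P-score computed over each complete trailing window of weeks.

    actual and expected are equal-length lists of weekly deaths.
    Deaths are summed within the window before taking the ratio.
    """
    if window <= 0 or len(actual) != len(expected):
        raise ValueError("window must be positive and series equal in length")
    scores = []
    for end in range(window, len(actual) + 1):
        a = sum(actual[end - window:end])
        e = sum(expected[end - window:end])
        scores.append((a - e) / e)
    return scores


# Hypothetical weekly series (not real data).
actual = [1000, 1050, 1600, 2100, 1800, 1300]
expected = [1000] * 6

four_week = rolling_p_score(actual, expected, window=4)
print([round(s, 4) for s in four_week])
```

Pooling deaths within the window, rather than averaging weekly P-scores, weights each week by its expected deaths; the two approaches coincide when the baseline is constant.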
Authors’ note: We are grateful to Eric Beinhocker, Gerry Kennally, Max Roser and David Vines for comments.
Appendix 1 Measures of excess deaths: comparing and contrasting the Z-score and the P-scores
Denote the number of weekly deaths by x.
The P-score is defined as follows:
(x minus the expected value of x for the population), divided by the expected value of x for the population
A variant P-score (U.S. National Center for Health Statistics) is defined as follows:
(x minus the upper threshold for the expected value of x for the population), divided by the upper threshold for the expected value of x for the population.
The upper threshold is defined as the expected value plus the 2.5% confidence interval for this expected value. This takes into account uncertainty created by the natural variability of x.
The Z-score is defined as follows:
(x minus the expected value of x for the population), divided by the standard deviation for the population of x around its expected value.
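For readers who prefer symbols, the three verbal definitions above can be written compactly. The notation is an assumption introduced here for illustration: E[x] denotes the expected value of weekly deaths, U the upper threshold for that expected value, and σ the standard deviation of x around its expected value. The last line follows algebraically from the first and third and shows why, for a given P-score, a smaller σ relative to E[x] yields a larger Z-score.

```latex
\begin{align*}
P &= \frac{x - \mathbb{E}[x]}{\mathbb{E}[x]} \\
P_{\text{variant}} &= \frac{x - U}{U} \\
Z &= \frac{x - \mathbb{E}[x]}{\sigma} \\
Z &= P \cdot \frac{\mathbb{E}[x]}{\sigma}
\end{align*}
```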
EuroMOMO estimate the expected value of each country’s weekly deaths using data for the previous five years, taking seasonal factors and trends into account, and adjust for delays in registration.
For count data, like weekly deaths, a Poisson distribution is a reasonable approximation to the underlying probability, taken into account in the estimated Z-scores of EuroMOMO.*
Graphs published for each country show the weekly Z-scores since 2015 compared to their usual range of -2 to +2, the approximate 95% confidence interval. Around 2.5% of observations would thus usually have a Z-value over 2. The Z-score equals 4 line is also shown, corresponding to a ‘substantial increase’: under usual conditions, the Z-value would exceed 4 only around 0.003% of the time.
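The tail probabilities quoted above can be checked under the assumption that Z-scores follow a standard normal distribution, which is only an approximation for reasons the notes to this appendix discuss. The helper name `upper_tail` is illustrative.

```python
from math import erfc, sqrt


def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))


for threshold in (2, 4):
    print(f"P(Z > {threshold}) = {upper_tail(threshold):.5%}")
```

Under this assumption, the probability of exceeding 2 is a little under 2.3% (the 2.5% figure corresponds to a threshold of 1.96), and the probability of exceeding 4 is about 0.003%.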
The graphs show more deviations of Z-scores ‘exceeding 2’ and ‘exceeding 4’ than one would expect. The main reason is that, to fit the baseline, EuroMOMO chose only the periods of the year when additional processes leading to excess deaths (e.g. winter influenza and summer heat waves) are not likely to occur. Normal variability is thus measured after excluding these seasons.**
* The Poisson is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time if these events occur with a known constant mean rate and independently of the time since the last event. The calculation is described in Farrington et al. (1996). As the data are probably not time-independent, the claimed probabilities associated with different Z-values are likely to be only approximate.
** See EuroMOMO webpage: “Methods”.
References
Anderson, R. M., H. Heesterbeek, D. Klinkenberg and T. D. Hollingsworth (2020), “How will country-based mitigation measures influence the course of the Covid-19 epidemic?” The Lancet 395 (10228): 931-934, 21 March.
Farrington, C.P., N.J. Andrews, A.D. Beale and M.A. Catchpole (1996), “A statistical algorithm for the early detection of outbreaks of infectious disease.” Journal of the Royal Statistical Society A 159: 547-563.
Favero, C., A. Ichino, and A. Rustichini (2020), “Restarting the economy while saving lives under Covid-19.” CEPR Discussion Paper 14664.
Gourinchas, P. (2020), “Flattening the Pandemic and Recession Curves.” In Richard Baldwin and Beatrice Weder di Mauro (Eds.) Mitigating the Covid Economic Crisis: Act Fast and Do Whatever It Takes, a VoxEU.org eBook, CEPR Press.
1 See the COVID-19 Dashboard by the Center for Systems Science and Engineering (CSSE).
2 R0 is the virus reproduction rate, which needs to be kept below 1 to avoid exponential growth of infections.
3 Details on this can be found in the evidence of Prof. John Edmunds to the UK Science and Technology Parliamentary Select Committee on 7 May.
4 EuroMOMO is a European mortality monitoring entity, aiming to detect and measure excess deaths related to seasonal influenza, pandemics and other public health threats. Official national mortality statistics are provided weekly from the 24 European countries and regions in the EuroMOMO collaborative network, supported by the European Centre for Disease Prevention and Control (ECDC) and the World Health Organization (WHO).
5 See the National Center for Health Statistics website.
6 This week, ending on 17 April, corresponds roughly with EuroMOMO’s peak week 15 (see definitional differences in Table 2). The ONS reports the registration of deaths, which lags the actual deaths. See “Deaths registered weekly in England and Wales”, Office for National Statistics.
7 German data are for Berlin and the state of Hesse and are available from EuroMOMO.
8 Scotland’s peak P-score is higher than that of Wales, though if cumulated over a few weeks, it would be lower.
9 Transmission and rates of infection are also influenced by factors like the nature of social distancing, availability and use of face masks, and cultural differences in the exercise of self-discipline and following of advice.
11 Complete but less timely data for all of Germany, tracked by the FT, confirm Germany and Denmark as having the lowest cumulative excess mortality rates of the 13 European countries compared. The EuroMOMO data for Berlin suggest a dramatic contrast with London.
12 A study which forecasts the one year ahead mortality is Denaxas et al. (2020).
13 They analyse daily death registry data for over 1000 Italian municipalities, which suggest that deaths registered as Covid capture only about half of excess deaths. They find strong evidence that locations where mass testing, contact tracing and at-home care provision were introduced experienced lower numbers of excess deaths.
14 At more granular levels, the weekly data can become noisy. Averages over longer periods are more informative.