Geology and Geostats Red Pennant Geoscience Blog Updates and Points of View

“Do Ya Feel Lucky?” – The COVID-19 trajectory graphs plotted with R-script

Published on May 4, 2020 on LinkedIn

Mike O’Brien wearing the right protective gear.

By Mike O’Brien
Director at Red Pennant Communications Corp. – RP Geoscience

I have been looking at the COVID-19 pandemic from a somewhat statistically-informed, but medically-ignorant perspective. My reason for doing so is to try to make sense of the daily news headlines from my own all-too-mortal perspective. I take no responsibility for any conclusions you may want to draw about mortality rates, herd immunity or lockdowns.

It would be good to be able to compare the COVID-19 trajectory or history from objective statistics. Sadly, there will always be differences in the way the number of cases and mortalities are reported and collected and there is going to be a bias between regions and countries in the way the data is compiled.

Daily starting point

My daily starting point (the best of the worst sources) is the data from the European Centre for Disease Prevention and Control (ECDC), downloaded automatically using R software. ‘Our World in Data’ relies on the same source for various reasons.

I look at graphs plotted on a log scale (e.g. 10 represented by 1 unit, 100 by 2 units, 1000 by 3 units and so on) for numbers of cases and deaths. We are told that epidemics grow exponentially, and an exponential function plots as an approximately straight line on a log scale.
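To illustrate why a log scale is useful here, the following is a minimal Python sketch (not the R-script used for the graphs). The case counts are made-up numbers that double every 3 days, and the names `log10_series`, `cases` and `steps` are hypothetical. It shows that pure exponential growth rises by the same amount at every step once the logarithm is taken, which is what a straight line means.

```python
import math


def log10_series(values):
    """Return the base-10 logarithm of each (positive) value."""
    return [math.log10(v) for v in values]


# Made-up cumulative case counts that double every 3 days (illustration only).
days = list(range(0, 31, 3))
cases = [200 * 2 ** (d / 3) for d in days]

logs = log10_series(cases)

# Rise in log10(cases) between successive 3-day steps.
# For pure exponential growth every step is the same size (log10 of 2 here).
steps = [later - earlier for earlier, later in zip(logs, logs[1:])]

for day, value in zip(days, logs):
    print(f"day {day:2d}: log10(cases) = {value:.3f}")
```

Real data will not be this tidy, so a curve that bends away from a straight line on the log plot is a sign that growth is no longer purely exponential.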

Cumulative / progressive figures are plotted against time (days) to get an idea of the totals without the daily or weekly noise created by organizations only issuing figures on arbitrary days.
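The smoothing effect of cumulative totals can be sketched in a few lines of Python (again a stand-in for the R-script, with made-up daily figures and hypothetical names). The daily reports are deliberately lumpy, as if an organization only issued figures on some days.

```python
from itertools import accumulate


def cumulative(daily):
    """Turn a list of daily counts into running (cumulative) totals."""
    return list(accumulate(daily))


# Made-up daily reports: nothing on some days, a batch on others.
daily_reports = [0, 0, 120, 0, 0, 95, 40]

totals = cumulative(daily_reports)
print(totals)
```

The running total never decreases, so the gaps in reporting show up as flat steps rather than as spikes and zeros.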

Graphs of cases and mortality using R-script

The pandemic started its growth phase at different times in different countries. I have ‘standardised’ the start date for each country to be able to approximately compare ‘trajectories’. I have set ‘day 0’ for the ‘cases’ trajectory for each country as the day on which a cumulative total of 200 cases had been officially recorded. I have set ‘day 0’ for the ‘deaths’ trajectory for each country as the day on which a cumulative total of 50 deaths had been officially recorded. These are arbitrary numbers arrived at by experimentation, but I see a number of other people and organisations are using a similar approach. This is the weakest part of the process.
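The ‘day 0’ standardisation described above might look something like the following Python sketch. It is an assumption about the logic, not a copy of the R-script: the function name `align_to_day_zero` and the sample totals are hypothetical, and only the thresholds of 200 cases and 50 deaths come from the text.

```python
def align_to_day_zero(cumulative_totals, threshold):
    """Re-index a cumulative series so that day 0 is the first day
    on which the total reaches the threshold.

    Returns a list of (day, total) pairs, or an empty list if the
    threshold is never reached (that country is simply not plotted).
    """
    for start, total in enumerate(cumulative_totals):
        if total >= threshold:
            return list(enumerate(cumulative_totals[start:]))
    return []


# Made-up cumulative case totals for one country.
country_cases = [50, 120, 210, 400, 800]
aligned_cases = align_to_day_zero(country_cases, threshold=200)

# Made-up cumulative deaths for a country that never reaches 50 deaths.
small_country_deaths = [0, 2, 5, 8, 10]
aligned_deaths = align_to_day_zero(small_country_deaths, threshold=50)

print(aligned_cases)
print(aligned_deaths)
```

Everything recorded before the threshold is discarded, which is one reason the choice of threshold matters so much to how the curves line up.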

But the graphs of cases and mortality will not be comparable for large and small countries, so population needs to be taken into account. To do this I divide the cumulative totals by the population of each country and plot the graphs as a percentage of population. I realise that the demographics of each country will probably have profound effects on the results, but this all amounts to crudely looking for patterns rather than a rigorous analysis.
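The population adjustment is simple arithmetic. A minimal Python sketch, with a hypothetical function name and a made-up country of 10 million people, is:

```python
def percent_of_population(cumulative_totals, population):
    """Express each cumulative total as a percentage of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return [100.0 * total / population for total in cumulative_totals]


# Made-up figures: a country of 10 million people.
population = 10_000_000
cumulative_cases = [200, 5_000, 50_000]

relative_cases = percent_of_population(cumulative_cases, population)
print(relative_cases)
```

On these made-up numbers, 50,000 cases in a population of 10 million works out at 0.5% of the population.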

I put this together using a heavily modified R-script originally developed by Seb Heinz (Heinz, 2020).

The plot of the relative % cases for some selected countries is shown below:

May 05, 2020 COVID-19 cases (Relative cases with standardized day 0 = 200 cases)

What do the standardized relative cases graphs show? Spain, USA, Italy, UK, Sweden, France and Canada appear to be heading for similar outcomes, with the percentage of cases topping out at roughly 0.5% of population. Iceland reached a similar position in much less time, but Iceland is a very small country. That leaves lots of unanswered questions:
Does the flattening of the curve reflect the development of ‘herd immunity’?
How can ‘herd immunity’ be achieved if only 0.5% of a population have been infected?
Is the number of cases seriously under-reported?
Have Australia, South Korea and Japan stopped the spread or is it temporarily stalled?
Do the country and regional differences reflect different strains of the virus?

Simplistically, and subject to data inconsistencies, it does not appear that there are significantly different outcomes in Europe and North America for different levels of lockdown (e.g. Sweden, US, UK, France). It appears to take about 60 days for the current wave to pass (with no guarantee that there will not be subsequent waves). Australia and Iceland appear to be exceptional and reached some sort of equilibrium within 20 days. (Are islands different?)

The plot of the relative % deaths is shown below. NOTE: this is based on the percentage of population, NOT the percentage of cases (‘mortality rate’). I have not estimated a mortality rate per positive case because that depends on the amount and type of testing carried out which, in turn, depends on local testing policies, the time lag for cases to become critical and advances in testing technology.

(For what it is worth, my numbers for the various countries give very variable and time-dependent estimated mortality rates of between 1 and 20%.)
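The difference between the two measures can be shown with a small Python sketch. The figures are invented for illustration and the function names are hypothetical: two regions have the same population and the same number of deaths, but one tests widely and the other tests only the seriously ill.

```python
def deaths_per_case_percent(deaths, confirmed_cases):
    """Deaths as a percentage of confirmed cases (depends on testing)."""
    return 100.0 * deaths / confirmed_cases


def deaths_per_population_percent(deaths, population):
    """Deaths as a percentage of the whole population."""
    return 100.0 * deaths / population


# Made-up figures: same population, same deaths, different testing.
population = 1_000_000
deaths = 500
cases_wide_testing = 25_000    # many mild cases found
cases_narrow_testing = 2_500   # only the seriously ill tested

rate_wide = deaths_per_case_percent(deaths, cases_wide_testing)
rate_narrow = deaths_per_case_percent(deaths, cases_narrow_testing)
share_of_population = deaths_per_population_percent(deaths, population)

print(rate_wide, rate_narrow, share_of_population)
```

In this invented example the per-case rate swings from 2% to 20% purely because of testing, while the share of population stays at 0.05% in both regions.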

The cumulative relative deaths graph is less affected by testing policies and technology, but it is still probably based on far from perfect data.

May 05, 2020 COVID-19 deaths

What does the standardized relative deaths graph show?

Spain, USA, Italy, UK, Sweden, France and Canada appear to be heading for similar outcomes, with the percentage of deaths topping out at roughly 0.05% of population. It appears that the overall average risk of death is low, but that is little comfort to people in the high-risk categories (based on age and pre-existing conditions). In case you are wondering, Iceland has had fewer than 50 deaths so far, so it is not plotted on the second graph.

Question: Are Australia, South Korea and Japan permanently stopping the spread or is there another wave coming?

‘Do ya feel lucky?’ You should. If you are reading this, congratulations: you have most likely survived the main 60-day pulse of the first wave in your location.

‘Wanna continue to feel lucky?’ Be sensible, wash your hands and be socially distant. What harm can that do?


R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

Heinz, Sebastian, 2020. Making Of: A Free API For COVID-19 Data. April 1, 2020
