Statistics in time of Coronavirus: How do you make sense of a data avalanche?

Each country's figures carry their own degree of error, depending on its political and medical infrastructure, so comparing numbers between countries is not objective.

There is a flood of coronavirus statistics all over the internet, leaving people to wonder which figures reflect reality. The benchmark data sets are those on the daily rise in infections, deaths and recoveries in various countries and regions. But as a Guardian report says, these figures have many limitations despite the flurry of statistics.

The aggregators of these numbers simply take them from the source regions, but the figures deserve some skepticism: without knowing a region's testing methods, we cannot be sure what they mean, since case counts depend on factors such as the number of tests carried out per day and to date. Germany, for example, had done 1.3 million coronavirus tests by April 9, while the UK had done about 317,000.

Delays and errors

For deaths, the counts often include only those reported in hospitals among COVID-19 patients. Reporting delays also come into play and are more likely in countries with large numbers of coronavirus cases.

On March 27, England announced that there had been 926 COVID-19 deaths, but the NHS later reported that the true number was 1,649. Daily counts are thus prone to error, which is why Our World in Data uses a rolling three-day average to smooth out the errors.
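To make the idea of a rolling three-day average concrete, here is a minimal Python sketch. The function name `rolling_mean` and the daily counts are invented for illustration; this is not the aggregator's actual code, and it assumes a trailing window (each day is averaged with the two days before it).

```python
def rolling_mean(values, window=3):
    """Trailing rolling mean over `window` days.

    The first window-1 entries are None because there are not yet
    enough days to average. (Trailing window is an assumption here.)
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical daily death counts, made up for this example.
daily_deaths = [100, 160, 90, 200, 150]

for day, (raw, avg) in enumerate(zip(daily_deaths, rolling_mean(daily_deaths)), start=1):
    shown = "n/a" if avg is None else f"{avg:.1f}"
    print(f"day {day}: reported={raw}, 3-day average={shown}")
```

The averaged series moves less from one day to the next than the raw reports, which is the point: a late or incomplete report on a single day is spread across three days instead of showing up as a sharp dip or spike.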

On top of this, each country's figures carry their own degree of error, depending on its political and medical infrastructure, so comparing numbers between countries would not be objective either.

The Royal Statistical Society's Statistical Ambassadors gave key points to consider when looking at these numbers:

  • Confirmed cases will be fewer than actual cases.
  • Comparisons of case and death data between countries may not be meaningful.
  • Models produce estimates with plausible ranges. These models can help us understand the likely effects of policies.

Model-based stats

Computer models are of two types. The first assumes that each infected person infects, on average, a certain number of others per day, and then factors in mitigation measures. The second is the empirical type, which fits curves to the data obtained and then extrapolates, assuming various curve shapes to predict different scenarios.
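The two approaches can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the function names, the parameter values, and the choice of a simple exponential as the fitted curve. Real epidemiological models are far more detailed (they track recoveries, age groups, hospital capacity and so on).

```python
import math


def simulate_growth(initial_cases, infections_per_case_per_day, days,
                    mitigation_day=None, mitigation_factor=1.0):
    """Mechanistic sketch: every existing case causes a fixed number of
    new infections per day; from `mitigation_day` onward that number is
    multiplied by `mitigation_factor`. Recoveries are ignored (a
    deliberate simplification). Returns cumulative cases for day 0..days.
    """
    cases = [float(initial_cases)]
    for day in range(1, days + 1):
        rate = infections_per_case_per_day
        if mitigation_day is not None and day >= mitigation_day:
            rate *= mitigation_factor
        cases.append(cases[-1] * (1 + rate))
    return cases


def fit_exponential(counts):
    """Empirical sketch: least-squares fit of log(count) = a + b * day,
    with days numbered 0, 1, 2, ... Returns (a, b)."""
    n = len(counts)
    if n < 2 or any(c <= 0 for c in counts):
        raise ValueError("need at least two positive counts")
    xs = list(range(n))
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def extrapolate(a, b, day):
    """Project the fitted exponential curve to a given day."""
    return math.exp(a + b * day)


# Type 1: hypothetical outbreak, with and without mitigation from day 5.
unmitigated = simulate_growth(100, 0.2, days=10)
mitigated = simulate_growth(100, 0.2, days=10,
                            mitigation_day=5, mitigation_factor=0.5)
print(round(unmitigated[-1]), round(mitigated[-1]))

# Type 2: fit a curve to made-up observations, then project forward.
a, b = fit_exponential([100, 200, 400, 800])
print(round(extrapolate(a, b, day=4)))
```

In both cases the output depends heavily on the assumptions fed in: the infection rate and mitigation effect in the first model, and the shape of the curve in the second. That sensitivity is one reason model-based estimates come with wide plausible ranges.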


On Thursday, the US-based Institute for Health Metrics and Evaluation (IHME) predicted that, assuming full social distancing through May 2020, there could be 66,000 COVID-19 deaths. But three days later the number was revised to 37,000, a little over half the previous estimate.

Mortality displacement

According to one piece, the current estimated risk for the general public is roughly the same as the risk we usually face over a year, but squeezed into a few weeks. Also, given the lives saved by side effects of coronavirus mitigation measures, such as fewer accidents and improved air quality, some say that many of those who are dying would have died within the next year anyway. This is called mortality displacement. One may even add that people with existing health conditions, who are more likely to die from coronavirus, already face a high chance of death from those conditions. But even this should be treated with skepticism.
