
Published: Monday, May 4, 2020 - 11:03


Each day we receive data that seek to quantify the Covid-19 pandemic. These daily values tell us how things have changed from yesterday, and give us the current totals, but they are difficult to understand simply because they are only a small piece of the puzzle. And like pieces of a puzzle, data only begin to make sense when they are placed in context. And the best way to place data in context is with an appropriate graph.

When using epidemiological models to evaluate different scenarios it is common to see graphs that portray the number of new cases, or the demand for services, each day.^{1} Typically, these graphs look something like the curves in figure 1.

While these epidemiological models provide some understanding and insight, they are rarely as complex as reality. Moreover, these nice models create expectations for the data. When we plot the daily values from an epidemic, we begin to look for the peak. We want to know if we are “over the hump” and if the epidemic “has run its course.” As a result, all kinds of complex ways of analyzing the data have been created in the effort to “find the peak.”

But data are always historical. Therefore, any predictions based on the data are only as good as the assumptions behind the computations performed. On the other hand, our data describe the current reality. While they almost always do this imperfectly, they will always supersede the predictions. So the problem is one of how to use the daily data for real-time feedback. How can we understand the progression of an epidemic without becoming lost in the complexity of models and their assumptions? The simplest answer is to plot the data in time-order sequence as we do here.

Figure 2 is the data-based equivalent of figure 1. It shows the numbers of new confirmed cases of Covid-19 each day. In creating figure 2, we began with the six countries that currently have the highest numbers of confirmed cases of Covid-19. These are the United States, Spain, Italy, Germany, France, and the United Kingdom. While these six countries contain less than 9 percent of the world’s population, they account for more than 60 percent of the world’s confirmed cases of Covid-19.^{2} Since the five European countries in that list have a combined population that is essentially the same as that of the United States, we have combined the five European countries to have comparable numbers for the graphs.

So, have we reached the peak? For both curves we have to ask which of the five peaks is “the peak.” While the European time series may be slowly drifting down (in spite of England’s efforts to the contrary), the U.S. numbers are all over the place, ranging from 21,000 to 48,000 in the last week. About all that we can learn from figure 2 is that, in this pandemic, the curves climb up more quickly than they subside.

The excessive amount of variation inherent in figure 2 makes it hard to interpret these daily values directly. We immediately want to smooth things out. While we could resort to various smoothing algorithms, the simplest way to smooth things out is to plot the daily cumulative totals.

Since epidemics tend to display exponential growth, the commonly preferred way to plot cumulative totals is to use a semi-log plot. In addition to keeping the numbers on the graph, the semi-log plot has the added virtue of explicitly showing the growth rate for the epidemic. It is the rate of growth, and changes in the rate of growth, that are key parameters for understanding an epidemic. (For more on how to create a semi-log plot see the appendix.)

On a semi-log plot a fixed rate of growth (exponential growth) will always plot as a straight line. Thus, regardless of where they are on the graph, parallel line segments will always represent the same rate of growth. This means that *any persistent departure from a straight line will represent a change in the rate of growth*. It is this feature of the semi-log plot that makes it self-interpreting. As the plot of the cumulative totals forms a curve, the rate of growth is changing. No further analysis is required to make sense of the data.
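The straight-line property can be checked numerically. The sketch below uses hypothetical totals (100 cases growing at a fixed 10 percent per day, then a variant whose rate is halved partway through; both are chosen only for illustration and are not taken from the figures). It shows that the logarithms of the totals rise by the same amount each day under a fixed growth rate, and by smaller amounts once the rate drops.

```python
import math

def log_steps(totals):
    """Day-to-day differences of log10(total): the slope on a semi-log plot."""
    logs = [math.log10(t) for t in totals]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical cumulative totals: 100 cases growing at a fixed 10 percent per day.
fixed = [100 * 1.10 ** day for day in range(8)]

# Same start, but the daily growth rate is halved after day 4 (illustrative only).
slowing = fixed[:5]
for _ in range(3):
    slowing.append(slowing[-1] * 1.05)

print([round(s, 4) for s in log_steps(fixed)])    # equal steps: a straight line
print([round(s, 4) for s in log_steps(slowing)])  # smaller later steps: a flattening curve
```

Equal steps correspond to a straight segment on the semi-log plot; the shrinking steps in the second list are what a flattening curve looks like in numbers.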

Figure 3 shows the total number of confirmed cases for the Covid-19 pandemic for selected countries. As these curves flatten, we see what has happened in the past and have a basis to understand what is likely to happen in the future.

In figure 3 we can see that the first wave of infection in China was followed by a second wave as the infection began to spread around the world in February. The jump on Feb. 13, 2020, resulted from a one-time adjustment (of 15,141) in the total number of cases reported by China. As China flattened out its epidemic growth rate, the disease spread around Asia, the Middle East, and into Europe. So on Feb. 20, the worldwide total began to pull away from the curve for China as the second wave of infection began.

In the week between March 8 and March 15, we see that South Korea had dramatically flattened its curve while the pandemic was rapidly growing elsewhere. In that week the world had an average growth rate of 4.5 percent per day; the five European countries had an average growth rate of 25 percent per day; the United States had an average growth rate of 33 percent per day; Norway averaged 34 percent per day; and Australia averaged 16 percent per day. South Korea averaged only 3.4 percent growth per day.

As we follow each of these curves we see that they continue to flatten as the rate of growth slows down. By the end of April, South Korea, Norway, and Australia had reduced their average growth rates to 0.1 percent, 0.7 percent, and 0.2 percent per day, respectively. These three countries show that democracies can indeed suppress the Covid-19 pandemic. The United States ended April with an average growth rate of 2.5 percent per day, Europe had 1.2 percent per day, and the world 2.4 percent per day. So while everyone has improved, not everyone has improved by the same amount or to the same extent.

By making the growth rates visible the semi-log plot provides the needed real-time feedback about the status of the pandemic. It lets us see the big picture on any scale: local, state, nation, region, or worldwide. The semi-log plot of the total counts helps us to understand what the daily numbers mean by putting those numbers in context.^{3, 4}

Figure 4 tells the story of a natural experiment. Sweden and Norway are two very similar countries with two different approaches to interventions. Both countries first exceeded 100 cases on March 7, and both countries had very similar curves with an average growth rate of about 10 percent per day in the latter half of March. On April 1 Sweden had 4,435 cases while Norway had 4,447 cases. However, Norway began to flatten its curve on March 29. By the end of April, Norway had lowered its average growth rate to 0.7 percent per day, and had 7,667 confirmed cases.

Sweden began to flatten its curve around April 12, but the Swedes were less aggressive in their efforts.^{5} By the end of April they had lowered their average growth rate to 2.9 percent and had a total of 20,300 cases. Thus, while starting from the same amounts, in one month Norway added only 3,200 new cases while Sweden added 15,900. Such is the marvel of compound growth rates.

With a growth rate of 2.9 percent per day Sweden is on track to double its number of confirmed cases in 24 days (May 24). If the Swedes can further flatten their curve they can postpone this doubling and gain time before reaching 40,000 cases. With a growth rate of 0.7 percent per day Norway’s doubling time is 99 days. If the Norwegians maintain this rate of growth, they will not reach 15,000 cases until mid-July.

Figure 5 shows the curves for Japan and South Korea. Unlike South Korea and every other country in figures 3 and 4, Japan did not experience a rapid growth phase. While most countries show growth rates in the range of 30 percent to 50 percent per day by the time they reach 100 cases, Japan just meandered through March and the first half of April with an average growth rate of 9 percent per day. While the curve did occasionally flatten out, it always picked right back up in a day or two. The first sustained changes in the slope of the curve occurred in the last two weeks of April. By the end of April, Japan had 14,088 cases and an average growth rate of 1.7 percent per day.

To put Japan’s 14,088 cases and growth rate of 1.7 percent in perspective, the state of Tennessee, with only 5 percent of Japan’s population, ended the month of April with 10,733 cases and a growth rate of 3.1 percent. If things don’t change, Tennessee could overtake Japan before the end of May!
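The arithmetic behind that comparison can be sketched as follows. This is a rough projection, not a forecast: it assumes both growth rates stay fixed at their end-of-April values, and the function name `days_until_overtake` is ours, introduced only for illustration.

```python
import math

def days_until_overtake(leader_total, leader_ratio, chaser_total, chaser_ratio):
    """Days until the chaser's total matches the leader's, at constant daily ratios."""
    if chaser_ratio <= leader_ratio:
        raise ValueError("the chaser never catches up unless it grows faster")
    # Solve chaser_total * chaser_ratio**n == leader_total * leader_ratio**n for n.
    return math.log(leader_total / chaser_total) / math.log(chaser_ratio / leader_ratio)

# End-of-April figures quoted in the text: Japan 14,088 cases at 1.7 percent per day,
# Tennessee 10,733 cases at 3.1 percent per day (ratios 1.017 and 1.031).
days = days_until_overtake(14088, 1.017, 10733, 1.031)
print(f"{days:.1f} days after April 30")  # roughly 20 days
```

Under those constant-rate assumptions the two totals would meet roughly three weeks into May.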

The whole purpose of interpreting the daily numbers is to determine whether things have gotten better or worse. The variation in the daily numbers of new cases prevents us from using them to directly answer this question. But when these same numbers are combined into cumulative totals and plotted on a semi-log plot, our answer is made clear by changes in the angle of the curve. In this way we gain insight without resorting to complexity.

Epidemiological models reveal the realities of exponential growth, and provide the perspective needed for planning. But the historical record is made visible when we plot the data on a semi-log plot. As we can see in figure 3, the first wave in China gave the warning. The second wave is occurring as the pandemic spreads around the world. At present this disease has manifested itself throughout those countries that are economically connected. What lies ahead remains uncertain, but we can track what is happening by continuing to plot the daily totals on a semi-log plot.

And we need to continue to try to flatten our cumulative curves simply because a flatter curve buys more time. Now, as we have seen, when we slow the exponential growth of the number of new cases, that is, when we get off the ascending portion of the curves in figure 2, the curves in figure 3 will naturally start to flatten out. The more we follow practices that slow the transmission of the disease, the smaller we make the daily number of new cases, the lower our growth rate becomes, and the flatter we make the curves in figure 3. While this worldwide, unintended experiment continues we may not know whether we have passed the peak, but we can always tell if the growth rate is increasing or decreasing by simply plotting the total numbers on a semi-log plot. No assumptions necessary, no models needed. Just listen to the facts in context.

You can download this Excel spreadsheet, enter the daily numbers yourself, and have a semi-log plot generated for you.

To get an average daily rate we usually take the last three days’ values, divide each by the previous day’s value, and average these three ratios.
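As a minimal sketch of that calculation (the function name and the four totals below are hypothetical, chosen only to show the steps):

```python
def average_daily_rate(totals, days=3):
    """Average of the last `days` day-over-day ratios of cumulative totals."""
    if len(totals) < days + 1:
        raise ValueError("need at least days + 1 cumulative totals")
    recent = totals[-(days + 1):]
    # Each day's total divided by the previous day's total.
    ratios = [today / yesterday for yesterday, today in zip(recent, recent[1:])]
    return sum(ratios) / days

# Hypothetical cumulative totals for four consecutive days (not real data).
totals = [1000, 1030, 1060, 1092]
rate = average_daily_rate(totals)
print(f"average daily ratio: {rate:.4f}")
print(f"average growth: {(rate - 1) * 100:.1f} percent per day")
```

Note that three ratios require four consecutive totals, since the earliest of the three days must be divided by the day before it.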

When the line is reasonably straight, doubling times may be found by simply counting the number of days between one value (say 350) and its double (700). Alternatively, a doubling time may be directly computed from the average daily rate by:

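Doubling Time in Days = [the logarithm of 2.000]/[the logarithm of the average daily rate]

Here the average daily rate is the day-over-day ratio described above (for example, 1.029 for growth of 2.9 percent per day), not the percentage itself.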
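The formula can be checked against the doubling times quoted for Sweden and Norway. The sketch below assumes the rate is entered as a day-over-day ratio (1.029 for 2.9 percent per day) and that the rate stays constant; the function name is ours.

```python
import math
from datetime import date, timedelta

def doubling_time_days(daily_ratio):
    """Days for the total to double at a constant day-over-day ratio (> 1)."""
    if daily_ratio <= 1:
        raise ValueError("the total only doubles when the daily ratio exceeds 1")
    return math.log(2) / math.log(daily_ratio)

sweden = doubling_time_days(1.029)   # 2.9 percent per day
norway = doubling_time_days(1.007)   # 0.7 percent per day
print(round(sweden), round(norway))  # 24 99

# Projected doubling date for Sweden, counting from April 30, 2020.
print(date(2020, 4, 30) + timedelta(days=round(sweden)))  # 2020-05-24
```

Any base of logarithm works here, as long as the same base is used in the numerator and the denominator.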

1. “Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand,” Imperial College, London task force

2. Website of the European Centre for Disease Prevention and Control

3. “Tracking Covid-19,” Donald J. Wheeler, Al Pfadt, Kathryn J. Whyte, *Quality Digest*, April 6, 2020

4. “Covid-19 Update,” Donald J. Wheeler, Al Pfadt, Kathryn J. Whyte, *Quality Digest*, April 13, 2020

5. “Sweden has nearly 10 times the number of COVID-19-related deaths than its Nordic neighbors. Here’s where it went wrong,” April 20, 2020

## Comments

## The Math of COVID-19, and Factories

You might be interested in the following post, in which I summarized what the pandemic has made me learn about statistical epidemiology:

https://michelbaudin.com/2020/05/06/the-math-of-covid-19-and-restarting-...

## An interesting contrast to our approach

I hope it is not bad form to be the first person to comment on our article, but I wanted to call attention to a link that I just discovered today that represents an entirely different approach to the one Don and I illustrate in this article.

The web address http://covid-measures.stanford.edu/ presents interactive models that permit one to "try out" different assumptions regarding various aspects of Covid-19 in terms of certain parameters such as demand for hospital beds or the anticipated number of fatalities associated with certain containment scenarios.

I hope readers don't get the impression from our article that we reject the utility of such models.

However, they serve a very different purpose than the one we take and both should be thought of as complementing each other.

## Worldometers data

I have been tracking cases and deaths using the Johns Hopkins data; after reading this excellent article I am incorporating your approach into my tracker, so I really appreciate your discussion of exactly how you ran your analyses. I also found the Worldometers data very interesting, and I can't help but notice that they have an "active cases" field there, and a testing field. Those numbers were interesting to me and I wanted to add them to my tracker, but can't find a link to any older data.

The "active cases" field, I think, could be more useful as a general indicator than cases or new cases. Assuming that the operational definition includes those who are known to have the virus and are either under observation or quarantined, and that people considered "cured" or "recovered" have been dropped from the counts, this number could be a more representative indicator of the current impact in a given area. I guess (because they have the number on the page) that they somehow also calculate the number of people who have recovered. That would be a good number, too. Of course, we'd need to know how much accuracy and precision there are around the data.

Do any of the authors have any insight into where we might get any past data from Worldometers?

## This was an old one

I stopped using Worldometers, because JH stopped using them when they stopped tracking a couple of months ago. Now I get the JH data from their latest source at GitHub. There are a few tracking sites where you can mouse over and see extra things in the tooltips. Doing that for 2,300+ counties would be too much work.