Previous examples have shown how to graph COVID-19 cases, cases per million, deaths, and deaths per million. This example graphs recoveries, recoveries per million, and what is probably the most important value: deaths per resolution.
The previous example showed how to load multiple data sets from different downloaded files. The basic approach is to save different kinds of data in separate arrays inside the CountryData class. When the user picks a data set, the program sets countries’ SelectedData references to point to the array containing the selected data. The program then graphs the selected data.
This program does the same for recoveries and recoveries per million.
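The idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not the example program's actual code. The field names, the dictionary keys, and the numbers are all assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CountryData:
    """Hypothetical stand-in for the program's CountryData class."""
    name: str
    population: int
    # One list of daily values per kind of data, keyed by a made-up label.
    series: Dict[str, List[float]] = field(default_factory=dict)
    # Refers to whichever list the user most recently selected.
    selected_data: List[float] = field(default_factory=list)

    def add_per_million(self, key: str) -> None:
        """Derive a per-million series from an existing raw series."""
        scale = 1_000_000 / self.population
        self.series[key + "_per_million"] = [v * scale for v in self.series[key]]

    def select(self, key: str) -> None:
        """Make selected_data refer to the chosen list (no copy is made)."""
        self.selected_data = self.series[key]


country = CountryData("Exampleland", population=2_000_000)
country.series["recoveries"] = [0, 4, 10]  # made-up numbers
country.add_per_million("recoveries")
country.select("recoveries_per_million")
print(country.selected_data)  # [0.0, 2.0, 5.0]
```

Because selecting a data set only changes a reference, switching between data sets is cheap and the graphing code only ever needs to look at the selected data.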
Approximating Case Fatality Rate
The value we would most like to learn is the virus’s case fatality rate (CFR). This is the number of deaths divided by the total number of diagnosed cases. Unfortunately we probably can’t learn that value until after the fact, but we can use deaths per resolution as an approximation.
A resolution is any time a patient is finished with the virus because that patient either died or recovered. Many people and the media are looking at deaths per case. Unfortunately the number of cases includes many that are not yet resolved. Some of those unresolved cases will end in recovery and some will end in death, so the ratio of deaths to cases is misleading.
For example, suppose the true case fatality rate is 50% so half of all patients eventually die. Now suppose it’s early in the pandemic and there have been 10 deaths, 10 recoveries, and 80 unresolved cases for a total of 100 cases. If you look at deaths per case, you get 10 deaths per 100 cases or 10%, which is far from the true case fatality rate.
Instead we need to look at deaths per resolution. In this example that would be 10 / (10 + 10) = 50%, which is correct.
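Here is the same arithmetic as a small Python sketch. The function names are made up for this illustration, and the numbers are the hypothetical ones from the example above.

```python
def deaths_per_case(deaths, recoveries, unresolved):
    """Deaths divided by all known cases, resolved or not."""
    return deaths / (deaths + recoveries + unresolved)


def deaths_per_resolution(deaths, recoveries):
    """Deaths divided by resolved cases only."""
    return deaths / (deaths + recoveries)


# Hypothetical early-pandemic numbers from the example above.
deaths, recoveries, unresolved = 10, 10, 80

print(f"Deaths per case:       {deaths_per_case(deaths, recoveries, unresolved):.0%}")  # 10%
print(f"Deaths per resolution: {deaths_per_resolution(deaths, recoveries):.0%}")        # 50%
```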
In practice it isn’t that simple because resolutions do not always occur at the same rate. For example, suppose it takes on average two weeks for someone to recover but three weeks to die. In that case the number of deaths will be underrepresented at any given moment. To think of it in another way, it means that the resolved and unresolved cases do not have the same mix of deaths and recoveries.
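A toy model makes the bias visible. The sketch below assumes a constant 10 new cases per day, a true CFR of 50%, exactly 14 days to recover, and exactly 21 days to die. All of those values are made up for illustration and are not estimates of anything real.

```python
TRUE_CFR = 0.5
DAYS_TO_RECOVER = 14
DAYS_TO_DIE = 21
NEW_CASES_PER_DAY = 10


def resolved_by(day, delay, fraction):
    """Cases that began early enough (day 0 onward) to resolve by `day`."""
    days_of_onsets = max(0, day - delay + 1)
    return days_of_onsets * NEW_CASES_PER_DAY * fraction


def observed_ratio(day):
    """Deaths per resolution as seen on `day`, or None if nothing has resolved."""
    deaths = resolved_by(day, DAYS_TO_DIE, TRUE_CFR)
    recoveries = resolved_by(day, DAYS_TO_RECOVER, 1 - TRUE_CFR)
    resolutions = deaths + recoveries
    return deaths / resolutions if resolutions else None


for day in (28, 60, 200):
    print(day, round(observed_ratio(day), 3))
```

In this model the observed ratio starts well below 50% and creeps toward it as the backlog of slow-to-resolve deaths becomes small relative to the total.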
Ideally we could look at all of the cases that started before a certain date that is far enough in the past that all of them are resolved. For example, if we knew how many cases started before March 1, it is now long enough after that date that all of those cases should be resolved. Then we could treat those cases as a mini-pandemic and see how many resulted in death.
Unfortunately I have not been able to find a source for that data.
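If that kind of case-by-case data were available, the cohort calculation might look something like the following Python sketch. The record layout and every record in it are invented for illustration; no real data source is implied.

```python
from datetime import date

# Made-up case records: (onset date, outcome).
cases = [
    (date(2020, 2, 20), "death"),
    (date(2020, 2, 25), "recovery"),
    (date(2020, 2, 28), "recovery"),
    (date(2020, 3, 5), "unresolved"),
    (date(2020, 3, 10), "death"),
]


def cohort_cfr(cases, cutoff):
    """CFR for cases that started before `cutoff`.

    Returns None if the cohort is empty or still contains unresolved
    cases, because then the ratio would not yet be final.
    """
    cohort = [outcome for onset, outcome in cases if onset < cutoff]
    if not cohort or "unresolved" in cohort:
        return None
    return cohort.count("death") / len(cohort)


print(cohort_cfr(cases, date(2020, 3, 1)))  # 1 death out of 3 resolved cases
print(cohort_cfr(cases, date(2020, 4, 1)))  # None: one case is still unresolved
```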
This example loads data for deaths and recoveries, and then adds the two to get total resolutions. It then calculates deaths per resolution. It’s not exactly what we would like, but it’s not a bad proxy. As time goes on, the number of resolved cases will grow so this ratio should become more accurate.
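As a rough Python sketch of that calculation (not the example program's actual code), the function below takes two cumulative daily series and returns the ratio for each day. The function name and the sample numbers are assumptions. Days with no resolutions yet are returned as None to avoid dividing by zero.

```python
def deaths_per_resolution_series(deaths, recoveries):
    """Return deaths / (deaths + recoveries) for each day, or None if undefined."""
    ratios = []
    for d, r in zip(deaths, recoveries):
        resolutions = d + r
        ratios.append(d / resolutions if resolutions > 0 else None)
    return ratios


# Made-up cumulative totals for four consecutive days.
deaths = [0, 1, 3, 6]
recoveries = [0, 0, 3, 18]

print(deaths_per_resolution_series(deaths, recoveries))  # [None, 1.0, 0.5, 0.25]
```

The early values in this made-up series swing from 100% to 25%, which is the kind of jumpiness you should expect while only a handful of cases have been resolved.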
Top Case Countries
The following picture is the one at the top of this post with annotations added.
You can see that all of the curves jump around wildly at the beginning when there are few resolved cases. As more cases are resolved, the curves settle down to something that approaches the true CFR. After the pandemic is over, the curves will settle on the true CFRs.
Note that different countries have different apparent CFRs. That is due at least in part to the abilities of the countries to care for their patients. If a healthcare system is overloaded, it cannot give every patient the necessary care so the CFR will be higher.
Some of the differences are also due to inconsistent reporting. For example, the United Kingdom ratio is 0.97 deaths per resolution. That is due primarily to missing data. For some reason the recovery data for the United Kingdom ends after April 12 so there are many deaths and no recoveries.
For another example, New York recently added a large number of deaths that had previously not been attributed to COVID-19 because they had other direct causes. For example, some people became infected with COVID-19, which caused the pneumonia that eventually killed them.
The other graphs show that the United States and Italy seem to be settling around the 35% area. Spain has a lower rate at 21%.
Germany has a much lower rate of 5%. That could be due to Germany doing a much better job of testing so they are detecting more mild and asymptomatic cases than the other countries. If that is the case, then the true CFRs of the other countries could be similar; they just have many more cases going undetected.
One estimate (see this post) says that the known cases make up only around 6% of all cases. If that is true, then the infection fatality rate (IFR, the ratio of deaths to all cases, both detected and undetected) is much lower than the apparent CFR.
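To get a feel for how much lower, here is a back-of-the-envelope Python sketch. It assumes that every death is among the detected cases, so deaths stay the same while the number of cases grows by a factor of 1 / 0.06. That assumption is mine and is almost certainly too simple. The 35% figure is the rough deaths-per-resolution value discussed above, used here as a stand-in for the apparent CFR.

```python
def implied_ifr(apparent_cfr, detected_fraction):
    """IFR implied by an apparent CFR if only `detected_fraction` of
    infections are detected and all deaths occur among detected cases."""
    return apparent_cfr * detected_fraction


apparent_cfr = 0.35        # rough deaths per resolution from the graphs
detected_fraction = 0.06   # the estimate that known cases are about 6% of all cases

print(f"Implied IFR: {implied_ifr(apparent_cfr, detected_fraction):.1%}")  # 2.1%
```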
That’s good news if you are not diagnosed with COVID-19. It’s still bad news if you are diagnosed. In a general sense, if you are not diagnosed but you are infected, then there is a very good chance that you will recover. If you are diagnosed, however, then there may be as much as a roughly 35% chance of severe problems.
The picture below, which was shown in the previous post, shows COVID-19 deaths per million in the Scandinavian countries.
In this picture Sweden has the most fatalities per million. Denmark has roughly half as many fatalities per million and Norway has fewer than a quarter as many.
However, the following picture, which shows deaths per resolution, tells a slightly different story.
Here Sweden still has a high rate, but Norway has even more deaths per resolution. If you use the program to look at the other graphs, you’ll find that Norway has a relatively small number of deaths and recoveries per million, but a high number of cases per million. I think the issue is that Norway has a relatively large number of unresolved cases so the graph is still fluctuating wildly. I think as more cases are resolved, its curve will settle down, hopefully at a lower level.
All of these numbers will change as more cases are resolved, but this program gives you a tool to track where outcomes may be headed. Testing remains inconsistent so it’s hard to pin down absolute numbers. Worldwide there have currently been 149,860 deaths and 560,672 recoveries, so the deaths-to-resolutions ratio is 149,860 / (560,672 + 149,860) = 21%.
I wish I had more data. There are two pieces of data that I would most like to have right now.
First, case-by-case data so we can group data into cohorts. In other words, I would like to see the outcomes (death, recovery, or unresolved) for cases that started on any particular date. That would let us calculate more accurate CFRs.
Second, population infection rates. In other words, when all is said and done, what percentage of a population will become infected, either obviously, mildly, or asymptomatically. (See that post I mentioned earlier.) If we can estimate the CFR and we know this infection rate, then we can estimate the total number of people who are likely to die.
One good thing about all of this is that the COVID-19 virus is not smart (despite what President Trump may say) but scientists are. That means the virus cannot change the way it works but we can change the way we respond to it. Researchers are working feverishly (so to speak) to develop new vaccines and treatments to prevent or lessen the effects of COVID-19. If we can slow the spread of the virus long enough, we will be able to fight it more effectively, lower the CFR, and save lives.
As always, download the example to experiment with it. If you learn anything interesting, mention it in the comments below.