GA COVID-19 Report July 3, 2020

I feel we cannot overuse this meme right now.

Note on Format

Today’s report is my first time posting to Medium, rather than just self-hosting the report as a static page on bitbucket. This should allow for a more “blog-like” format, and make it easier to share and access the report (thanks to @ReesMorrison for the suggestion)! I’ve retooled the report to post to Medium automatically using the R package ‘mediumr’. Unfortunately, this required some refactoring of the original codebase used to generate the reports, and the primary loss is that my interactive graphs created with ggplotly don’t seem to want to cooperate. As a result, all the graphs here are static. If you want the interactive graphs, you can still get them at my daily upload on bitbucket, which can be found at https://asb12f.bitbucket.io/COVID19/7-3.html (for prior dates, just change the date at the end of the link).

If you know how to get plotly and mediumr to play together nicely, please let me know!

Daily Summary & Notes

Today’s report uses the data from the 2:50PM report from the GA Department of Public Health.

Today we saw 2784 new cases (our record for new cases is 3472), which brings us to 17498 in the past 7 days (19.3% of total cases so far). We also had 7 new deaths (our record for new deaths is 100), which brings us to 86 in the past 7 days (3% of total deaths so far). We saw 153 new hospitalizations (our record is 442), bringing our 7-day count to 1048 (9% of total hospitalizations so far). Lastly, we had 24 new ICU admissions (54 is the record), bringing our 7-day count to 169 (7% of total ICU admissions so far).

For testing, we saw 26978 new COVID19 tests, bringing us to 139189 in the past 7 days (15.3% of total COVID19 tests so far). We also saw 2929 new antibody tests, bringing us to 15326 in the past 7 days (9.3% of total antibody tests so far).

Today is our 3rd highest case increase on record, which is particularly alarming when you consider that 3 days ago it would have been our largest increase by nearly 600 cases. I don’t care how much you love fireworks, STAY HOME.

Data

Data Notes

Prior to 5/11, all data is taken from the noonish update from the GA Department of Public Health, which keeps the time intervals between data points even (important for graph interpretation). On 5/11, the reporting schedule shifted to 9AM, 1PM, and 7PM, so from that date this report captures the 1PM reporting time. On June 2nd, reporting was reduced to once a day at 3PM. The data reflects multiple inefficiencies and inaccuracies in the current reporting system, including showing tests before their results are returned and delays in reporting on weekends that create artificial spikes and valleys in the change data. In general, interpretation should examine the general trends, and not focus exclusively on endpoint trajectories, which are easily influenced by these data variations.

To help visualize the effects of State actions on the outbreak, I’ve added a few sets of lines to several of the graphs. The first (the vertical blue lines) shows when the state of emergency went into effect (3/15; solid line) and when we might expect to see first effects from it (dotted line). The second (vertical red lines) shows the Friday the Shelter in Place was instituted (4/3; solid line) and the date we might expect to see first effects (dotted line). The third (vertical pink lines) shows when the shelter in place was lifted (4/30; solid line) and the date we might expect to see first effects (dotted line).

Where point data is presented, a LOESS regression with 95% confidence intervals is shown to help the viewer interpret overall trends in the data. This is preferred over a line graph connecting all points, which tends to over-emphasize outliers in the report.

Cumulative Confirmed Cases

Cumulative Hospitalizations

Cumulative Deaths

Cumulative ICU Use

Change Patterns

Count Level Tracking

Z Score Fluctuations

Because percentage growth becomes misleading over time, I’ve added a floating 4-week Z-score visualization for each measure to help put into perspective the magnitude of daily variation in numbers.

For those who don’t spend a lot of time in the world of statistics, a Z score is a measure that describes the relationship of an observation (in this case, a particular day’s number) to the average across the entire group. It is calculated by taking the difference between the observation and the mean, and dividing by the standard deviation.

\[ Z = (O - \text{Mean}) / SD \]

For example, if the mean score for a group is 50, and the standard deviation is 10, then a score of 60 would have a Z score of (60 - 50) / 10 = 1, and a score of 20 would have a Z score of (20 - 50) / 10 = -3.

This can be useful in identifying patterns in data reporting, and help put daily fluctuations in perspective. Because the data is more localized, it doesn’t fall victim to the diminishing returns effect. These visualizations are limited to the data from the last 30 days, which further helps illustrate trends and fluctuations.
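As a rough sketch of the calculation described above (not the code used to build this report), a trailing-window Z score could be computed like this in Python. The function names are hypothetical, and two details are assumptions: that the window includes the current day, and that the sample standard deviation is used.

```python
from statistics import mean, stdev


def z_from_summary(observation, window_mean, window_sd):
    """Z = (O - Mean) / SD, using an already-computed mean and SD."""
    return (observation - window_mean) / window_sd


def trailing_z_scores(daily_counts, window=30):
    """Z score of each day's count against the trailing `window` days.

    Assumption: the window includes the day being scored, and the
    sample standard deviation (n - 1 denominator) is used.
    """
    scores = []
    for i in range(window - 1, len(daily_counts)):
        recent = daily_counts[i - window + 1 : i + 1]
        scores.append(z_from_summary(daily_counts[i], mean(recent), stdev(recent)))
    return scores


# The worked example from the text: mean 50, SD 10.
print(z_from_summary(60, 50, 10))
print(z_from_summary(20, 50, 10))

# Today's new cases against the 30-day mean and SD reported below.
print(round(z_from_summary(2784, 1386.63, 771.29), 2))
```

A window in which every day has the same count has a standard deviation of zero, so a real implementation would need to handle that case before dividing.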

New Cases

For today’s cases, the 30-day mean is 1386.63 and the standard deviation is 771.29.

Hospitalizations

For today’s hospitalizations, the 30-day mean is 107.8 and the standard deviation is 60.68.

Deaths

For today’s deaths, the 30-day mean is 24.43 and the standard deviation is 18.53.

ICU Admissions

For today’s ICU Admissions, the 30-day mean is 19.07 and the standard deviation is 9.77.

Testing

These graphs contain several markers that reflect the changing nature of the testing data that has been provided over time.

As of 4/28 specific counts of the number of tests administered by the government and commercial providers stopped being reported. Additionally, on this date we began to track data on the number of positive tests conducted by the CDC.

On 5/27, specific counts of serology tests (antibody tests) became available, which had previously been aggregated into the total test count. This date has been marked with a vertical gold line on the graphs. This distinction is important, as positive antibody tests do not result in new cases in the overall count, and thus both suppress the positive test rate and artificially inflate estimates of test prevalence. The daily data for daily COVID19 tests and serology tests is tracked starting on this date.

Cumulative Testing

Positive Tests by Source

Total Testing Trends

For today’s new tests, the 30-day mean is 16374.3 and the standard deviation is 5689.84.

COVID19 Molecular Testing Trends

For today’s new tests, the 30-day mean is 14250.17 and the standard deviation is 5877.7.

COVID19 Antibody Testing Trends

For today’s new tests, the 30-day mean is 2424.13 and the standard deviation is 2334.72.

Is Increased Testing Causing Increased Cases?

A popular talking point recently is that the increase in cases that are being detected is not reflective of increased spread, but rather a result of increased testing. There is a certain logic to this — the more tests that are run the more potential cases we can identify. However, this can lead us to significant logical errors, and these in turn can lead to dangerous behaviors. While our data does not allow a perfect causal analysis, we can examine what associations between testing and cases exist in our data.

If we run a simple correlation between total number of tests and total number of cases, we get an initially persuasive graph. Note that this graph includes both antibody and molecular tests.

This gives a correlation of 0.98! This is inviting, but it mostly just shows that both of these numbers are increasing. This is potentially misleading because it looks at cumulative data. In fact, if we run a correlation between the total number of tests administered and a simple series of ascending numbers (1, 2, 3, etc.) we get a correlation of 0.97. Because our hypothesis (increased testing causes increases in reported cases) is more about fluctuations in these two variables than cumulative growth, we need a different analysis.
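To see why cumulative totals correlate so strongly no matter what, here is a small Python sketch using made-up numbers (not the GA data). It builds two unrelated series of daily counts from a seeded random generator, then compares the correlation of the daily values with the correlation of their running totals. The `pearson` and `cumulative` helpers are hypothetical names written for this illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def cumulative(daily):
    """Running total of a series of daily counts."""
    totals, running = [], 0
    for count in daily:
        running += count
        totals.append(running)
    return totals


rng = random.Random(0)  # fixed seed so the illustration is repeatable
# Two independent, made-up series of daily counts.
daily_a = [rng.randint(50, 150) for _ in range(100)]
daily_b = [rng.randint(50, 150) for _ in range(100)]

r_daily = pearson(daily_a, daily_b)
r_cumulative = pearson(cumulative(daily_a), cumulative(daily_b))
r_vs_counter = pearson(cumulative(daily_a), list(range(1, 101)))

print(f"daily vs daily:            {r_daily:.2f}")
print(f"cumulative vs cumulative:  {r_cumulative:.2f}")
print(f"cumulative vs 1, 2, 3...:  {r_vs_counter:.2f}")
```

The daily series are unrelated by construction, yet their running totals line up almost perfectly with each other and with a plain counter, simply because all three only ever go up.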

If we look at the daily increase in cases against the daily increase in tests, we get a different picture:

This gives us a correlation of 0.5384451. But this number is also misleading, because there are significant time lags in reporting of tests and new cases within the data.

To better assess the relationship, let’s look at 10-day moving averages for both new tests and new cases, and see what correlation exists between them. This will help balance out the issues of delayed results.

This gives us a correlation of 0.54. Because our data is observational, we can’t infer causation, and we can’t eliminate extraneous factors, but we can observe that the association between these two variables is limited, and that the increases in cases cannot be attributed purely or even primarily to the quantity of testing occurring.
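For readers who want to try the moving-average comparison on their own numbers, a minimal Python sketch follows. It is not the code behind this report: the daily counts are made up for illustration, the helper names are hypothetical, and it assumes a trailing (rather than centered) 10-day window.

```python
def moving_average(values, window=10):
    """Trailing moving average; the first window - 1 days are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up daily counts for illustration only (not GA DPH data).
daily_tests = [12000, 9000, 15000, 14000, 16000, 8000, 7000, 13000,
               17000, 18000, 15500, 16500, 9500, 8800, 14200, 19000,
               20500, 21000, 19800, 11000, 10500, 17500, 22000, 23000]
daily_cases = [900, 1100, 700, 1000, 1300, 1200, 600, 800,
               1400, 1500, 1700, 1300, 1600, 900, 1000, 1800,
               2100, 2000, 2300, 2500, 1400, 1500, 2600, 2800]

raw_r = pearson(daily_tests, daily_cases)
smoothed_r = pearson(moving_average(daily_tests), moving_average(daily_cases))

print(f"day-to-day correlation:      {raw_r:.2f}")
print(f"10-day smoothed correlation: {smoothed_r:.2f}")
```

Averaging over 10 days spreads each day's reporting quirks across the window, so a test counted today and a case confirmed a few days later still land in overlapping averages.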

Are We Flattening the Curve?

Reported cases are slowing, or rather they aren’t accelerating as much. We’ve shifted from an exponential growth rate to a more linear one; you can see that both in the overall cases graph and in the “new cases per day” graph. In theory, this could be reflective of a potential flattening of the curve (as Dr. Carlos del Rio suggested back on 4/10).

The problem in that analysis is that making inferences about the status of the virus in the population relies on us having quality measurement. Now, the measurement doesn’t have to be perfect — in fact, it never will be — but the errors in measurement need to be relatively random and not reflect any systematic bias. Our data struggles greatly with that standard; we know there are issues with the availability of tests, delays and accuracy in test outcome reporting, and systemic barriers that have shaped who has had access to the tests. By and large this means we only have data that’s representative of a few parts of the state, and that data is probably skewed away from the most vulnerable populations.

A further limiting factor is the relative unavailability of testing. The daily testing data has been very linear — we rarely break 4000 tests a day. If people aren’t being tested or can’t be tested, then we can’t get good data on those people’s infections. If testing doesn’t scale with the potential infection, then it artificially caps the growth rate. Without exponential increases in testing, it’s impossible for us to track exponential increases in infection.

What our testing data does show us, however, is that 20–25% of people who are tested are receiving positive tests, a stat that hasn’t shifted in weeks. When we factor in the other systematic issues in testing, it’s hard to read this as anything other than steady growth. I’ll be convinced that things are getting better when we see a sustained increase in testing with consistently falling positive test rates (that aren’t proportional to the testing increase).

Is the Shelter in Place Working?

The shelter in place order went into effect on April 3rd, which means that any potential effects of the order will only begin to be apparent in the week after April 17th, and only then if that data is adequately capturing the current status of infection in the state.

To help visualize the effects of State actions on the outbreak, I’ve added a few sets of lines to several of the graphs. The first (the vertical blue lines) shows when the state of emergency went into effect (3/15; solid line) and when we might expect to see first effects from it (dotted line). The second (vertical red lines) shows the Friday the Shelter in Place was instituted (4/3; solid line) and the date we might expect to see first effects (dotted line). The third (vertical pink lines) shows when the shelter in place was lifted (4/30; solid line) and the date we might expect to see first effects (dotted line).

Reviewing the data, we don’t see a dramatic shift in new cases per day. That’s likely because the orders were not well enforced: as we saw, many people disregarded shelter in place orders, continued to have regular person-to-person contact, went shopping without protective devices, and generally failed to take precautions. In practice, I think what we’ve seen is that cases slowed (we shifted from an exponential growth pattern to a linear growth pattern), which is good but not enough to contain the situation. With the subsequent end of these orders, we should expect to see some gradual increases as people slowly begin to take on more risk.

Where is the Spike?

I’ve commented several times that I don’t think we’ll see a dramatic spike in cases as GA “re-opens”, not because contagiousness is a myth but because we failed to ever really close. Further, those Georgians who did take the efforts at mitigation seriously are generally continuing to do so. A much more probable pattern is a gradual increase in cases over time, but not the overnight “spike” that commentators keep talking about.

I do have to say I find the “Spike” discussion frustrating, because it pre-supposes that we ever saw a dramatic decline in cases, and that’s just not what the data shows. It’s much like the conversations about “Wave 2” in the Fall, which only make sense to ask if we ever stop “Wave 1”.

Is Herd Immunity A Viable Solution?

Here’s my understanding of all of this, based on what I’m seeing from various public health experts and background readings. It’s an area that can be complicated with well known diseases, and with the current Coronavirus we’re still figuring some of it out.

Herd immunity (and for that matter, a lot of infection control) is based around a metric called the basic reproduction number (abbreviated “R0”), which reflects the number of people who are likely to be infected by a single infected person in a population where everyone is susceptible (the default state in most infections). For example, an R0 of 3 means that each infected person in a population, on average, infects 3 other people. The higher R0 is, the more rapidly an illness spreads. Ideally, we want R0 to be less than 1, as this indicates that spread is slowing. Calculating R0 is complicated, because it reflects a product of the length of time that a person is contagious, the likelihood of infection when a susceptible person encounters the contagion (through people or otherwise), and the frequency of these contacts. This means that the R0 will vary between populations, and even on a day to day basis. Because of this, we only ever have an estimate of R0 in a population. An important related concept is the effective reproduction number (abbreviated “R”), which is similar to R0 but reflects our estimate of spread when not all of the population is susceptible to infection. A good overview is here.

Herd immunity functions by reducing the number of susceptible people in the population, and is generally considered to be achieved when the proportion of immune people is high enough to push R under 1. We can roughly estimate the proportion of the population that needs to be immune for herd immunity (termed Vc) to go into effect using this equation:

\[ V_c = 1 - (1 / R_0) \]

For example, for measles, a common estimate of R0 in an urban population is around 18; thus we’d need about 95% of the population to be immune to have herd immunity. Estimates of COVID19 R0 have put it between 2.5 and 5, which suggests a need for between 60 and 80% of the population to develop immunity before herd immunity is achieved.
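To make the arithmetic behind those percentages explicit, here are the same R0 values plugged into the threshold equation (the R0 figures are the estimates quoted above, not new data):

\[ V_c = 1 - \frac{1}{18} \approx 0.944 \quad \text{(measles, roughly 95\%)} \]

\[ V_c = 1 - \frac{1}{2.5} = 0.60 \qquad V_c = 1 - \frac{1}{5} = 0.80 \quad \text{(the COVID19 range)} \]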

Unfortunately, in practice that simple equation doesn’t usually work. There are three significant factors to consider. First, estimating the proportion of the population that needs immunity depends on getting a good estimate of R0. Second, the immunity needs to be randomly distributed in the community. If there are clusters in the population that are systematically excluded from immunity, then it’s possible for the infection to overcome larger herd immunity (this is what we’re seeing with the measles outbreaks in anti-vaxxer communities). Lastly, this model depends on individuals achieving total immunity. When infection or vaccination only grants partial immunity, then we have to increase the total proportion of the population with partial immunity even further (represented as Vcp). This is generally calculated as

\[ V_{cp} = V_c/E \]

where E is the proportion of people who have immunity after vaccination. To illustrate, if only 75% of people develop a strong immunity after vaccination, then for a virus with an R0 of 2.5 we’d need 80% of people to be vaccinated (instead of 60% if we had total immunity). Unfortunately, if the R0 is 5, we’d need about 107% of people vaccinated, which is impossible. In general, if E is smaller than Vc, then immunization alone won’t be effective at containing the infection. A good primer on the epidemiology of herd immunity can be found here.
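If you want to try other combinations of R0 and E, the two equations above translate into a few lines of Python. This is only a sketch of the simple model described here; the function names are hypothetical, and the 75% figure is the illustrative value from the text rather than a measured vaccine effectiveness.

```python
def herd_immunity_threshold(r0):
    """V_c = 1 - 1/R0: share of the population that must be immune."""
    if r0 <= 1:
        return 0.0  # with R0 at or below 1, spread shrinks on its own
    return 1 - 1 / r0


def required_coverage(r0, effectiveness):
    """V_cp = V_c / E: share that must be vaccinated when only a
    fraction E of vaccinated people end up immune."""
    return herd_immunity_threshold(r0) / effectiveness


for r0 in (2.5, 5):
    coverage = required_coverage(r0, effectiveness=0.75)
    achievable = coverage <= 1  # nobody can vaccinate more than everyone
    print(f"R0 = {r0}: need {coverage:.1%} vaccinated (achievable: {achievable})")
```

Any result above 100% means that vaccination with that level of effectiveness cannot reach the threshold by itself under this model.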

Bringing this all back to COVID19, we quickly run into some thorny issues. We’re still developing estimates for R0 for COVID19, which means that our best estimates of Vc are very speculative right now. We also have evidence that suggests infection does not confer immunity, or at best confers only partial immunity. This means that even if we have 100% infection, we might not achieve herd immunity. Further, if the virus evolves at a rapid pace (which it seems to be doing), it’s likely that we may develop immunity to one strain only to be susceptible to others (which is a common problem with coronaviruses in general). Lastly, immunity acquired through infection isn’t randomly distributed in the way that herd immunity requires; it naturally creates segmentation. At minimum, this would mean that we’d need to increase immunity levels substantially above a baseline estimate. Taken together, it means that waiting for all of us to naturally develop immunity at a level sufficient to protect the community would be a slow process that would require extremely high rates of infection, and even then might not work. It’s probably only achievable through a universal vaccination campaign with a highly effective vaccine, or a well regulated series of boosters. Thankfully, there are people working frantically to develop such tools!

As a minor aside, it is worth talking about the death toll to achieve that level of infection and thus potential immunity. As of right now (10AM on 5/6/20), the CDC is reporting 1,171,510 cases in the US. If we estimate the true infection population is 10x that amount, then that means that we’re only at about 3.5% of the US population infected. If we scale this up linearly (which will be an underestimate), then at 60–80% infection rate we’re looking at around 1–2 million deaths. These numbers also do depend on us keeping the curve flattened, which means keeping measures to reduce spread rate in place; if we remove those preventative steps then it’s likely we’ll see renewed curve growth. That’s an expensive gamble for something that might not even work.

In the meantime, we do have a second avenue of intervention. Instead of reducing R, we can reduce R0. This is what our current public health interventions do. If we identify contagious people through systematic testing and contact tracing to isolate people during contagious windows, reduce the likelihood of infection during contacts through the use of PPEs and disinfectants, and reduce the number of contact events through sheltering in place and social distancing, we can disrupt the infection transmission and reduce R0. If we can sustain those interventions and push R0 below 1 for our population, we can successfully contain and beat the virus. This is exactly what has happened in China, South Korea, New Zealand, Vietnam, Taiwan and many other countries, where the current transmission R0 of the virus has essentially been reduced to 0. Unfortunately, this also requires real commitment to these measures; doing them as half measures (as much of the US has) means we might reduce R0 but not push it below 1, or sustain it there long enough to be meaningful.

Since we’re talking countries, we should talk about Sweden, because that’s where a lot of the Herd Immunity conversation is coming from. Officially, Sweden isn’t trying to do herd immunity, at least according to their foreign minister. They’re gambling on people electively reducing contact, and that their healthcare system (which is much more robust than ours) will be able to identify and isolate infections before they spread. They’re also gambling that having a generally healthier population will grant some protection. So far, the data doesn’t look good. Let’s compare Sweden to Georgia (as of 10AM on 5/6/20), since they have similar populations (about 10M). In GA, we have around 1300 deaths so far; Sweden is just under 3000. Interestingly, Sweden has detected fewer cases than GA has (23K to 29K); this suggests that their monitoring hasn’t been particularly effective. If we look at Case Fatality Rates, Sweden is at around 12%, while GA is only around 4% (we should treat this with some skepticism; we’re clearly undercounting in both places). Given that GA’s population has a much higher rate of “underlying conditions”, this discrepancy is alarming. Sweden also has the 7th highest death toll per capita, and has one of the fastest growing fatality rates in Europe. It’s not a winning strategy.
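The case fatality rate comparison above is just deaths divided by confirmed cases. Here is a quick Python sketch using the rounded figures quoted in this section (so the results come out slightly different from the approximate percentages in the text). The function name is hypothetical, and the 10M population is the round number used above.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Deaths as a share of confirmed cases (not of all infections)."""
    return deaths / confirmed_cases


POPULATION = 10_000_000  # both places, rounded as in the text

# Rounded figures from the text, as of 5/6/20.
sweden_cfr = case_fatality_rate(deaths=3000, confirmed_cases=23_000)
georgia_cfr = case_fatality_rate(deaths=1300, confirmed_cases=29_000)

print(f"Sweden CFR:  {sweden_cfr:.1%}")
print(f"Georgia CFR: {georgia_cfr:.1%}")
print(f"Sweden deaths per 100K:  {3000 / POPULATION * 100_000:.0f}")
print(f"Georgia deaths per 100K: {1300 / POPULATION * 100_000:.0f}")
```

Because the denominator is confirmed cases only, under-testing pushes this rate up, which is part of why both numbers should be read with skepticism.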

What About Using Common Sense?

I want to take a moment and talk about the “use common sense” talking point that’s been going around. It usually goes something like this: “it’s all safe and fine, just use common sense and you won’t get sick”. Aside from the clear victim blaming elements, it fails to understand the way “common sense” works.

Common sense relies on people’s ability to accurately estimate the risk that a given activity poses, and moderate their behavior accordingly. That’s fine for many threats: when a stray dog growls and barks at you, you can easily assess the risk and avoid the harm. But it doesn’t work when threats aren’t highly visible, and when you’ve got bad heuristics about what constitutes a threat. The invisibility of the threat is multi-factored, which makes it almost impossible to use common sense to navigate.

First, the contagion is microscopic. This means you cannot see it in the air or on surfaces that you encounter. Even if you haven’t had any contact with another human being, there’s no way for you to know whether something you touch is contaminated. Thus, you can’t guess whether it’s safe to touch something out in public.

You also can’t tell who is infected. This is true both of the large number of asymptomatic but contagious people, and of the symptomatic who are simply masking their symptoms (good luck visually spotting the guy with the fever in line at Walmart). You can’t even tell for sure if you’re infected yourself. This means that avoiding infected people is impossible, short of avoiding all people. When you take these two items together, it becomes clear where many of the 6-foot rules don’t hold up well. If I’m potentially spreading contagion in a 6 foot moving bubble around my person, and you enter the area where I was just moments ago, the risk is still there. The only difference is that we can all pretend that we were following guidelines. This is why physical distancing doesn’t work in classrooms, homes, or other places people congregate.

Common sense also leads us to some unhelpful responses. Because we cannot see contagion, we substitute stimuli associated with it as signals. Hence, we treat people using PPEs as though they’re infected, we profile people based on their race and ethnicity, and we treat every pollen-related sneeze as a personal attack. And because this feels like doing something, we don’t attend to real risks.

In the absence of the ability to objectively assess immediate risk, we have to take a statistical approach, something that our brains are bad at. Since we can’t eliminate all possible exposure (hermits in well-supplied bunkers excluded), we have to take steps to minimize it instead. This is where we’ve failed: we’ve conflated “minimize” with “reduce”. We’ve done it because it’s convenient, and because our brains keep telling us “I don’t see any obvious threat, so it must be OK”. Common sense is literally encouraging us to be unsafe.

In summary, common sense is why infection control is failing in GA, and across the US. Common sense is what makes liars pushing anti-vaxxer talking points and fake cures so profitable. Common sense has killed over 100K people in the US so far. Let’s not use common sense about this.

Should People be Protesting?

While I do have concerns that we will see protest-related spread, that’s still a week away from appearing in the data. We’ve seen protestors taking precautions against spread (wearing masks, limiting crowd time, emphasizing open areas, etc.) as well as police tactics that have likely exacerbated spread (tear gas, holding people in confined areas with poor airflow, removal of PPEs from protestors, etc.). It’s impossible to calculate the level of risk present. However, the public health and medical communities are unified around one point on the topic: systemic racism is a public health issue that has immense human, financial, and moral costs, and must be addressed. Further, public health experts are consistent in the message that if people take adequate precautions, the risk in speaking out against racism is outweighed by the benefit. The two issues, COVID19 and racism, are necessarily intertwined; to quote the open letter from infectious disease experts at the University of Washington: “protests against systemic racism, which fosters the disproportionate burden of COVID-19 on Black communities and also perpetuates police violence, must be supported.” This is why you’re seeing so many medical professionals on the front lines of these protests, and why the WHO supports them.

Final Thoughts

As always, I should point out that I am NOT an epidemiologist, and I will always defer to the experts.

Stay Home.

Wash Your Hands.

Wear a Mask.

Assume You’re Infected.

Documentation

Code and data available here.