Published on in Vol 6, No 2 (2020): Apr-Jun

Preprints (earlier versions) of this paper are available at, first published .
Modeling COVID-19 Latent Prevalence to Assess a Public Health Intervention at a State and Regional Scale: Retrospective Cohort Study

Modeling COVID-19 Latent Prevalence to Assess a Public Health Intervention at a State and Regional Scale: Retrospective Cohort Study

Modeling COVID-19 Latent Prevalence to Assess a Public Health Intervention at a State and Regional Scale: Retrospective Cohort Study

Original Paper

Corresponding Author:

Philip J Turk, MSc, PhD

Center for Outcomes Research and Evaluation

Atrium Health

1300 Scott Ave, Office 124

Charlotte, NC, 28203

United States

Phone: 1 304 376 5377


Background: Emergence of the coronavirus disease (COVID-19) caught the world off guard and unprepared, initiating a global pandemic. In the absence of evidence, individual communities had to take timely action to reduce the rate of disease spread and avoid overburdening their health care systems. Although a few predictive models have been published to guide these decisions, most have not taken into account spatial differences and have included assumptions that do not match the local realities. Access to reliable information that is adapted to local context is critical for policy makers to make informed decisions during a rapidly evolving pandemic.

Objective: The goal of this study was to develop an adapted susceptible-infected-removed (SIR) model to predict the trajectory of the COVID-19 pandemic in North Carolina and the Charlotte Metropolitan Region, and to incorporate the effect of a public health intervention to reduce disease spread while accounting for unique regional features and imperfect detection.

Methods: Three SIR models were fit to infection prevalence data from North Carolina and the greater Charlotte Region and then rigorously compared. One of these models (SIR-int) accounted for a stay-at-home intervention and imperfect detection of COVID-19 cases. We computed longitudinal total estimates of the susceptible, infected, and removed compartments of both populations, along with other pandemic characteristics such as the basic reproduction number.

Results: Prior to March 26, disease spread was rapid at the pandemic onset with the Charlotte Region doubling time of 2.56 days (95% CI 2.11-3.25) and in North Carolina 2.94 days (95% CI 2.33-4.00). Subsequently, disease spread significantly slowed with doubling times increased in the Charlotte Region to 4.70 days (95% CI 3.77-6.22) and in North Carolina to 4.01 days (95% CI 3.43-4.83). Reflecting spatial differences, this deceleration favored the greater Charlotte Region compared to North Carolina as a whole. A comparison of the efficacy of intervention, defined as 1 – the hazard ratio of infection, gave 0.25 for North Carolina and 0.43 for the Charlotte Region. In addition, early in the pandemic, the initial basic SIR model had good fit to the data; however, as the pandemic and local conditions evolved, the SIR-int model emerged as the model with better fit.

Conclusions: Using local data and continuous attention to model adaptation, our findings have enabled policy makers, public health officials, and health systems to proactively plan capacity and evaluate the impact of a public health intervention. Our SIR-int model for estimated latent prevalence was reasonably flexible, highly accurate, and demonstrated efficacy of a stay-at-home order at both the state and regional level. Our results highlight the importance of incorporating local context into pandemic forecast modeling, as well as the need to remain vigilant and informed by the data as we enter into a critical period of the outbreak.

JMIR Public Health Surveill 2020;6(2):e19353



In December 2019, a novel coronavirus emerged in Wuhan, Hubei Province, China [1]. The pathogen causes a respiratory illness, now known as the coronavirus disease (COVID-19) [2,3]. From its original epicenter in Wuhan, the virus spread rapidly within 30 days to other parts of Mainland China and exported to other countries [4-8]. As of April 10, 2020, 210 countries and territories have reported 1,673,423 confirmed cases of COVID-19 and 101,526 deaths [9]. Due to the spread across multiple countries and the large number of people impacted, on March 11 the World Health Organization recognized the novel severe acute respiratory syndrome (SARS) coronavirus 2 as a pandemic that poses a major global public health threat [10,11].

Although the effects of the COVID-19 pandemic are experienced worldwide, many key health policy decisions designed to reduce transmissions are determined at national and regional levels. These critical policy decisions must be implemented quickly and evaluated continuously so they can be adapted to the local context, recognizing the clear effect that geography, community context, density, and social determinants of health have on COVID-19 outcomes. In North Carolina, the first COVID-19 case was reported on March 2, 2020, and cases increased to 3,963 total confirmed cases as of April 10 [12]. To slow the rapidly increasing transmission rate, within a few weeks after the first case was detected, North Carolina state officials promoted social distancing strategies (ie, deliberately increasing physical space), banned large social gatherings, and closed public schools and universities. Subsequently, a stay-at-home order, which only allows for essential travel outside the home, was issued in the southwestern part of the state by Mecklenburg County effective at 8 am, March 26, lasting through April 16 (since extended to April 29), while a statewide stay-at-home order was issued effective at 5 pm, March 30, lasting to April 29.

Because the COVID-19 landscape evolves rapidly due to the confluence of locally relevant factors, appropriate modeling using timely infection prevalence to drive decision making around containment, treatment, and resource planning is critical. Forecasting models are used to generate early warnings to identify how a pandemic might evolve. During the early stages of the COVID-19 pandemic, forecasting was frequently applied to predict national and international infection transmission trends [13,14]. Local communities and health systems turned to these national and international models for their own planning; however, the generalizability of such models to the local situation is limited and ignores important community-level population characteristics and transmission dynamics [3,15-17]. An objective of this study was to understand how spatial differences impact model results and their interpretation.

In response to the need for actionable data insights in our community and health system, investigators from the Atrium Health Center for Outcomes Research and Evaluation developed a series of COVID-19 forecasting models, which were used to guide Atrium Health’s initial proactive response to ensure sufficient capacity to treat the expected surge in patient care demands. In this study, we present an initial susceptible-infected-removed (SIR) epidemic model and its evolution to the susceptible-infected-removed-social distancing-detection rate (SIR-int) model. In this paper, we describe and compare these models, the spatial differences in a pandemic, the significance of observed cases versus actual prevalence in the setting of rapidly evolving testing strategies, the current epidemiological trends, and the potential effects of nonpharmaceutical interventions applied locally (eg, social distancing).

The observed cumulative case and death counts were obtained daily at noon starting March 2, 2020, when the first COVID-19 case was reported, from the North Carolina Department of Health and Human Services website for all 100 counties [12]. Data collection for this manuscript ended on April 7, just prior to submission. To accurately estimate the actual latent prevalence at time t, the cumulative case counts were adjusted for imperfect detection by dividing them by 0.14. Although estimates of detection probability for the coronavirus, also known as the ascertainment rate, vary in the literature, ours is in line with those reported [18-22]. Modeling only the observed prevalence will give an inaccurate timeline of pandemic behavior. Cumulative deaths were then subtracted from adjusted cumulative cases. We also adjusted cumulative cases for recoveries by removing cases after 20 days, the estimated median duration of viral shedding from illness onset [23]. Daily incremental incidence was obtained by subtracting the estimated latent prevalence at time t – 1 from that at time t. Crucially, in our research, we model estimated latent prevalence as constructed here, not observed prevalence. For the sake of brevity moving forward, we use the terms “latent prevalence” and “prevalence” interchangeably.

In addition to North Carolina, interest also lay in the subpopulation served by Atrium Health’s greater Charlotte market. For convenience, we make use of a designation of a group of counties that constitute the greater Charlotte area used by the North Carolina Department of Health and Human Services. Specifically, the US Centers for Disease Control and Prevention’s Cities Readiness Initiative (CRI) is a federally funded program designed to enhance preparedness in the nation’s largest population centers to rapidly and effectively respond to large public health emergencies such as an act of bioterrorism. This also allowed us to model on a region that harmonized with the state’s approach to disaster planning in case statewide coordination of resources would be required. Within North Carolina, 11 counties are grouped into a CRI region that includes Anson, Cabarrus, Catawba, Cleveland, Gaston, Iredell, Lincoln, Mecklenburg, Rowan, Stanly, and Union. Collectively, we henceforth refer to these counties as “the CRI” (Figure 1). Because the CRI closely mirrors the large area served by Atrium Health’s greater Charlotte market, we used this population base for our local modeling efforts. The CRI includes over 2.5 million residents (24% of the North Carolina population) and ranges from rural settings like Anson County to Mecklenburg County, which contains North Carolina’s largest city, Charlotte [24]. To understand how spatial differences impact model results and their interpretation, we compared the CRI to North Carolina throughout the early phases of this pandemic.

We introduce the SIR deterministic compartmental model originally described by Kermack and McKendrick [25] and depict it in Figure 2.

Figure 1. Map of North Carolina showing the Cities Readiness Initiative region.
View this figure
Figure 2. Susceptible-infected-removed model diagram showing compartments and flow. I: infected; R: removed; S: susceptible.
View this figure

S is the number of individuals that are susceptible to infection in the population; I is the number of individuals that are infected; R is the number of individuals that are removed from the population via recovery and subsequent immunity or death from infection. This mutually exclusive and exhaustive partition is such that S + I + R = N, where N is the closed population size. We further assume all uninfected individuals are susceptible to infection. The transition flow is described by the arrows in the figure labeled with two rates. The parameter β is the infection rate and can be further decomposed as the product of the probability of transmission per contact and the rate of contact per person per unit time. γ is the removal rate.

More formally, the SIR model is a system of three ordinary differential equations (ODEs) involving two unknown parameters.

Note that all of S, I, and R, and their derivatives are functions of time t, such as S = S(t), although we do not denote this notationally here. By how the model is constructed, the first equation in the system returns a number less than or equal to zero, the second equation returns any real number, and the third equation returns a number greater than or equal to zero.

All data analysis was done using R statistical software, version 3.6.2 (R Foundation for Statistical Computing). As described in Churches [26], we used the ode() default solver from the desolve package to solve the system of ODEs defining the SIR model. Next, we used a quasi-Newton method with constraints to find the optimal values for β and γ on (0, 1) by minimizing the square root of the sum of the squared differences between I, which is our prevalence, and its prediction Î over all time t [27]. To establish initial conditions for model fitting, we estimate the population size of North Carolina and the CRI to be 10,488,084 and 2,544,041, respectively, using information taken from census estimates [24]. After obtaining the estimates and , to help assess model goodness-of-fit, we define the following statistic:

Time is indexed from i 1, ... , n, and n is the number of prevalences in the sample. Note that Ī is the average of the Ii’s.

To compare different scenarios for both North Carolina and the CRI, we define an SIR model (SIR-pre) fit to the data from the time of the outbreak until the time of the March 26, 2020, Mecklenburg County stay-at-home order. Since Mecklenburg County is the state’s second largest county, this could have a strong effect on the pandemic trajectory, both in the CRI and the state; therefore, we have used this date to delineate the date of the significant public health intervention. We further define an SIR model (SIR-post) fit to the data from the time of the outbreak until the end of data collection.

Given the major public health intervention implemented on March 26, 2020, we modified the SIR model for both the CRI and North Carolina to accommodate this (denoted SIR-int). SIR models with interventions can be simulated using the EpiModel package. This package provides tools for building, simulating, and analyzing several classes of models for the population dynamics of infectious disease transmission in epidemics. These include not only deterministic compartmental models, but stochastic individual contact models and network models. We first fitted the SIR model as before to the data up until March 26 and extracted the estimates of β and γ. After March 26, we retained the removal rate but modified the infection rate. We set the preintervention probability of transmission equal to 0.015, which is consistent with other viral infectious diseases like SARS and AIDS [28,29]. We then set the rate of contact so that the probability of transmission multiplied by the rate of contact equaled . To simulate the observed intervention, using the default fourth-order Runge-Kutta Method (RK4) ODE solver, we affected the probability of transmission by iteratively decreasing the hazard ratio of infection, given exposure to the intervention (step size of 0.0001) compared to no exposure, until the fitted infection curve yielded a maximum .

For exploratory data analysis, we generated time plots for prevalence, incidence, and both daily and cumulative deaths. The basic reproduction number R0 is the average number of secondary cases of disease caused by a single infected individual over his or her infectious period in a population where all individuals are susceptible to infection. To estimate R0, we compute:

and are estimates taken from the model fit. Since the SIR model is fully parameterized by β and γ, we also obtain predictions and over all time t. The percentage of infected at peak prevalence was computed by dividing the maximum Î by the population size N, while the final percentage of infected was computed as the limit 1 − (∞)/N. To estimate doubling time and compute a 95% confidence interval, we modeled incidence growth by fitting a loglinear model as a function of time t using the incidence package.

Figure 3 shows time plots of prevalence, cumulative deaths, incidence, and daily deaths for North Carolina from the start of the outbreak on March 2 up to and including April 7, 2020. The first death was recorded in North Carolina on March 24.

Figure 4 shows time plots of prevalence, cumulative deaths, incidence, and daily deaths for the CRI from the start of the outbreak on March 11 up to and including April 7, 2020. The first death was recorded in the CRI on March 25.

Notably, the prevalence and cumulative death curves for both figures look exponential. Although both incidence curves are increasing, the incidence curves become volatile after the stay-at-home order went into effect. Prior to March 26, 2020, doubling time was estimated to be 2.56 days in the CRI (95% CI 2.11-3.25) and 2.94 days in North Carolina (95% CI 2.33-4.00). Once data after March 26 were included, the doubling times increased and were estimated to be 4.70 days in the CRI (95% CI 3.77-6.22) and 4.01 days in North Carolina (95% CI 3.43-4.83).

Figure 3. Time plots for NC. COVID-19: coronavirus disease; NC: North Carolina.
View this figure
Figure 4. Time plots for the CRI. COVID-19: coronavirus disease; CRI: Cities Readiness Initiative.
View this figure

Tables 1 and 2 gives a synopsis of the model fits for each location and model type. The estimated R0 of 2.36 for the CRI prior to March 26, 2020, is more typical of the range of R0 values given in the literature for COVID-19, while the value of 1.79 for North Carolina is substantially lower [5,30]. After the intervention, the estimated R0 values for both locations drop to a similar value, although this result was affected by a reduced model fit. A comparison of the efficacy of intervention, defined as 1 – the hazard ratio of infection, gives 0.25 for North Carolina and 0.43 for the CRI. Using these hazard ratios to compute estimates of R0 from March 26 onward (0, post), we derive 1.34 and 1.33 for North Carolina and the CRI, respectively. This suggests that the COVID-19 outbreak is rapidly decelerating in North Carolina and the CRI after the aggressive public health intervention.

Table 1. Summary table of model fit for SIR-pre and SIR-post models in NC and the CRI.

aNC: North Carolina.

bSIR: susceptible-infected-removed.

cCRI: Cities Readiness Initiative.

Table 2. Summary table of model fit for susceptible-infected-removed–int model in NC and the CRI.
LocationHazard ratio0,post

aNC: North Carolina.

bCRI: Cities Readiness Initiative.

Figures 5 and 6 show plots of the three fitted models’ infection curves for North Carolina and the CRI, respectively, out to April 7, 2020. The behavior in the two plots is the same. The SIR-post model clearly demonstrates a lack-of-fit to the data. For the SIR-int model, we noted the hinge point induces a change of behavior from March 26 onward. The dotted orange line represents the SIR-pre forecast projections from March 26 onward. They are much larger than the actual data.

Figure 5. Infection prevalence prediction curves for NC up to April 7, 2020. COVID-19: coronavirus disease; NC: North Carolina; SIR: susceptible-infected-removed.
View this figure
Figure 6. Infection prevalence prediction curves for the CRI up to April 7, 2020. COVID-19: coronavirus disease; CRI: Cities Readiness Initiative; SIR: susceptible-infected-removed.
View this figure

Figures 7 and 8 show plots of the three fitted models’ infection curves for North Carolina and the CRI, respectively, projected out to the beginning of August. In both plots, we see the dramatic effect of the public health intervention; that is, the so-called “flattening of the curve.” There are two important differences to note between North Carolina and the CRI region. First, the CRI visibly shows relatively more flattening. This effect can be best observed in Table 3 in the peak infected and final infected columns. Moving from the pre to post to int models within a location, the drop in percentage infected is more pronounced in the CRI. In fact, for the SIR-int model, the percentages are virtually the same for both locations; that is, the CRI has “slowed down” to the state as a whole. Second, the date of peak prevalence was initially 8 days earlier for the CRI compared to North Carolina. However, using the current SIR-int model, although both locations showed their infection curves shifting forward in time, the date of peak prevalence is now 3 days later in the CRI (Table 3). To put this into context, for North Carolina, the time duration from the start of the outbreak to the peak prevalence has gone from 49 days to 70 days (43% increase). However, for the CRI, the time duration from the start of the outbreak to the peak prevalence has gone from 32 days to 64 days (100% increase).

Figures 9 and 10 show plots of the three fitted models’ removal curves for North Carolina and the CRI, respectively, projected out to the beginning of August. These plots support what we have observed so far. With the continued intervention, the removal curves are beginning to collapse, which is a behavior we would expect. For the SIR-int model, both locations show a removal plateau being reached roughly around the beginning of July.

Figure 7. Infection prevalence prediction curves for NC up to August 1, 2020. COVID-19: coronavirus disease; NC: North Carolina; SIR: susceptible-infected-removed.
View this figure
Figure 8. Infection prevalence prediction curves for the CRI up to August 1, 2020. COVID-19: coronavirus disease; CRI: Cities Readiness Initiative; SIR: susceptible-infected-removed.
View this figure
Table 3. Summary table describing infection under three different models in NC and the CRI.
Peak KineticsFinal infected, n (%)
LocationModel2020 dateÎPeak infected, %
NCa (n=10,488,084)SIRb-preApr 205,673,2701,213,1903,601,625127,639,271 (73)
NC (n=10,488,084)SIR-postApr 286,614,437866,4043,007,24486,776,491 (65)
NC (n=10,488,084)SIR-intMay 117,913,011366,0372,209,03734,798,450 (46)
CRIc (n=2,544,041)SIR-preApr 121,142,320537,031864,690212,217,696 (87)
CRI (n=2,544,041)SIR-postApr 241,488,530282,257773,254111,826,953 (72)
CRI (n=2,544,041)SIR-intMay 141,911,34389,324543,37441,163,824 (46)

aNC: North Carolina.

bSIR: susceptible-infected-removed.

cCRI: Cities Readiness Initiative.

Figure 9. Removal prevalence prediction curves for NC up to August 1, 2020. COVID-19: coronavirus disease; NC: North Carolina; SIR: susceptible-infected-removed.
View this figure
Figure 10. Removal prevalence prediction curves for the CRI up to August 1, 2020. COVID-19: coronavirus disease; CRI: Cities Readiness Initiative; SRI: susceptible-infected-removed.
View this figure

Principal Results

In terms of model fitting, we state several observations. The SIR-pre model represents a “worst case” scenario, as if the disease were allowed to run its course. Hence, early in a pandemic like this, it serves a useful purpose to help leaders understand the consequences of taking no action, or delayed action on implementing public health interventions. Beyond that, a basic SIR model, especially one that is used after being fit only to early pandemic data, imparts no further value for informing pandemic response planning, and indeed may provide errant forecasts. This diminished value also holds true when a basic SIR model is fit to contemporary data, yet ignores the effect of a public health intervention, as demonstrated by the SIR-post model. Eventually, both such models will provide a poor fit to the data. Because the behavior of any epidemic is dynamic, any model requires constant monitoring, assessment of fit to local data, and evaluation of efficacy as new data are collected or additional research becomes available. Our SIR-int model provides an example where this attention to model fit and incorporation of regional influences allows for appropriate model adaption and careful calibration thus generating the most accurate predictions available to guide regional decision making at the time.

Summarizing the effect of the intervention, the doubling time for both locations is substantially slower after the intervention, with the CRI doubling time estimate (4.70 days) now being greater than North Carolina (4.01 days). The stay-at-home orders strongly appear to be working as intended as the infection curves for both locations are now becoming flatter (and shrinking), with peak infection prevalence now being pushed towards mid-May, both location’s recovery curves starting to fall, and measurable intervention effects on the hazard ratio and R0. It is interesting to note that our results match rigorous Monte Carlo simulation studies we conducted weeks beforehand.

If we compare the two locations, the estimated R0 of 2.36 for the CRI prior to March 26, 2020, is more typical of the range of R0 values in the literature for COVID-19, while the value of 1.79 for North Carolina is substantially lower. This could be attributed to the fact that the CRI contains the largest city in North Carolina, and one of the United States’ busiest airports, setting the stage for this region to have become another COVID-19 hot spot. It is interesting to note that the North Carolina SIR-int model showed a better fit when the changepoint was also set to March 26, rather than March 30 when the statewide stay-at-home order went into place. One possible explanation for this could be that as the pandemic began in earnest, the general population’s fear of the virus also increased, perhaps causing most North Carolina citizens to shelter-in-place prior to the order going into effect. Another explanation is that Mecklenburg County accounts for almost 11% of the North Carolina population and so the effect of the county order directly impacted adjoining counties in the CRI, thus influencing the observed effect at the state level. Two additional interesting observations highlight the critical influence of spatial variation. First, the CRI infection curve evidences relatively more flattening and a later peak infection date. Second, the intervention effect in the CRI also appears stronger. The likely explanations for these differences are the Mecklenburg County stay-at-home policy going into effect 5 days before the state order, the different reaction of the local population to the order and its related messaging, and innumerable other unknown covariates such as early canceling of religious services, public gathering policies, and canceling of elective medical visits and procedures.


There are limitations to the SIR model. Some take issue with its deterministic form, although one could fit a Bayesian SIR model to make it stochastic. Perhaps the biggest limitation is that β and γ could be time-varying due to different forms of intervention (enhanced personal protective measures and social distancing). However, as we have shown in this paper, we can easily leverage pre-existing R functions to incorporate a changepoint that modifies the probability of transmission to acknowledge an important public health intervention. It is also possible to customize the SIR model within R to define more advanced and different transition processes, and then parameterize and simulate those models to accommodate insights from additional research. In this way, one can also examine “what if” scenarios or assess model robustness through sensitivity analysis. The SIR model is simple to understand and easier to fit, as opposed to other deterministic compartmental models, such as SEIR, or stochastic individual contact models [31]. However, these more advanced models will play an increasingly important role in forecasting and understanding the dynamics of this evolving pandemic.

The lack of widespread COVID-19 testing, both for symptomatic and asymptomatic individuals, presents a major limitation of unknown scale and implications to forecasting models [32,33]. Data sources are known to undercount cases, only include asymptomatic illness by chance, and define cases inconsistently based on variable testing criteria between and within geographies. Collectively, these contribute to imperfect detection. As a result, high-level models may not comprehend the full extent of the outbreak, creating challenges in producing accurate forecasts. Our decision to base our modeling strategy on estimated latent prevalence addresses this inconsistency by adjusting observed prevalence counts. Modeling only the observed prevalence has the effect of shifting the SIR curves ahead in time by several days or more. Although our estimate of the detection probability (0.14) is heuristically motivated, a thorough search of the literature supports our use of this estimate as reasonable. Future work will focus on refining this estimate as new research appears and allowing it to vary as a function of time.

Comparison With Prior Work

Although there is a plethora of models that estimate the impact of COVID-19 in the United States, there are far fewer that give localized projections. We note that our mid-May date for the peak infection curve is roughly 3-4 weeks later than the projection from the often-cited model from the Institute for Health Metrics and Evaluation [34]. The latter uses a Bayesian generalized nonlinear mixed model to examine cumulative death rates and assumes a strict social distancing policy is in place. Using data up until March 13, 2020, Columbia University reported a mid-May peak time for North Carolina under no control measures and a start of July peak time under some control measures [35]. The authors caution that their metapopulation SEIR model is designed to capture national trends, and local projections should be viewed as broad estimates. Other models, such as the CHIME model from the University of Pennsylvania Health System, relied on data from three Pennsylvania hospitals to estimate hospital capacity and clinical demand and was not designed to capture changing regional mitigation strategies [36].

Policy and Practice Implications

In the context of limited national policy guidelines to reduce COVID-19 transmission, provide resources for health care system pandemic preparedness, and mitigate health consequences, state and local authorities must have reliable, timely, and geographically specific models to manage the unfolding crisis. We provided our local forecasts to health system leaders and public health officials to help guide regional planning. Because we regularly refit our models to local data, these served as a flexible tool enabling first proactive preparedness based on the initial pandemic trajectory, followed by timely pivoting of capacity planning to match the observed disease deceleration. Furthermore, locally accurate forecasts enhance the relevance of forecasting’s role in public health communication [37]. For example, the potential disease impact on local health system capacity may help communities understand the rationale for public health interventions, whereas the positive effects of community mitigation may provide reinforcement for maintaining strategies like social distancing and enhanced hygiene.

Using regional and state data, we demonstrate how epidemiological modeling based on local context is critically important to informing pandemic preparedness for health systems and policy leaders. The results highlight the importance for such models to be created using local data, as opposed to running a simulation that makes many assumptions about the truth of parameter values. All models should be continuously recalibrated and adapted to the rapid, continuously changing situations inherent to a pandemic. A one-size-fits-all approach to the underpinning forecasting model or reliance on data that does not incorporate local context, sets the stage for misguided forecasting. Additionally, our study shows that, although a classic SIR model may perform well in the early days of the pandemic, it begins to lose relevance with the emergence of additional influences like social distancing and enhanced awareness of personal hygiene.

The SIR-int model has high predictive accuracy based on data collected from March 2 to April 7, 2020, for both North Carolina and the CRI and is able to demonstrate clear, compelling evidence of the efficacy of a stay-at-home order. By modeling estimated latent prevalence as we have done in this paper, instead of observed prevalence, a lag delay in projecting peak infection can be avoided, reducing the consequences to leaders who require an accurate timeline for planning purposes (eg, surge planning of hospital beds, supplies, and personnel).


All other things being equal, if residents continue to observe the stay-at-home orders, maintain attention to social distancing, and increase personal hygiene, then this wave of the COVID-19 outbreak would essentially be over by mid-July. It is possible that we could see continued flattening and shrinking of the infection curve in which case our forecast results would adapt commensurately. It is also possible that infection prevalence could oscillate at a low level over time, in which case more advanced modeling and methods would be needed. Our results highlight the importance of incorporating local context into pandemic forecast modeling, as well as the need to remain vigilant and informed by the data as we enter into a critical period of the outbreak. Although there will regrettably still be tragic loss of life and many North Carolina citizens infected by the coronavirus, this scenario pales in comparison to what could have been a far worse conclusion.


PT takes full responsibility for the integrity and accuracy of the statistical analysis and assembling the manuscript; SHC generated the map and was involved in discussions on statistical analysis; YT and MS drafted the abstract and discussion; MK and JP drafted the introduction; PT drafted the methods, results, and part of the discussion; MK, MS, YT, and PT assembled references; YT, SHC, MS, and AW performed critical revision of the manuscript for important intellectual content; PP collected the data and drafted the bibliography; and AW supervised the study.

Conflicts of Interest

None declared.

  1. Zhu N, Zhang D, Wang W, Li X, Yang B, Song J, et al. A novel coronavirus from patients with pneumonia in China, 2019. N Engl J Med 2020 Feb 20;382(8):727-733. [CrossRef]
  2. Guan W, Ni Z, Hu Y, Liang W, Ou C, He J, China Medical Treatment Expert Group for Covid-19. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med 2020 Apr 30;382(18):1708-1720 [FREE Full text] [CrossRef] [Medline]
  3. Lai C, Shih T, Ko W, Tang H, Hsueh P. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges. Int J Antimicrob Agents 2020 Mar;55(3):105924 [FREE Full text] [CrossRef] [Medline]
  4. Wu Z, McGoogan JM. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72 314 cases from the Chinese Center for Disease Control and Prevention. JAMA 2020 Feb 24. [CrossRef] [Medline]
  5. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med 2020 Mar 26;382(13):1199-1207 [FREE Full text] [CrossRef] [Medline]
  6. Remuzzi A, Remuzzi G. COVID-19 and Italy: what next? Lancet 2020 Apr;395(10231):1225-1228. [CrossRef]
  7. Holshue ML, DeBolt C, Lindquist S, Lofy KH, Wiesman J, Bruce H, Washington State 2019-nCoV Case Investigation Team. First case of 2019 novel coronavirus in the United States. N Engl J Med 2020 Mar 05;382(10):929-936 [FREE Full text] [CrossRef] [Medline]
  8. Arentz M, Yim E, Klaff L, Lokhandwala S, Riedo FX, Chong M, et al. Characteristics and outcomes of 21 critically ill patients with COVID-19 in Washington State. JAMA 2020 Mar 19 [FREE Full text] [CrossRef] [Medline]
  9. World Health Organization. 2020 Mar 23. Coronavirus disease 2019 (COVID-19): situation report - 63   URL: https:/​/www.​​docs/​default-source/​coronaviruse/​situation-reports/​20200323-sitrep-63-covid-19.​pdf?sfvrsn=d97cb6dd_2.​)
  10. Onder G, Rezza G, Brusaferro S. Case-fatality rate and characteristics of patients dying in relation to COVID-19 in Italy. JAMA 2020 Mar 23. [CrossRef] [Medline]
  11. Sohrabi C, Alsafi Z, O'Neill N, Khan M, Kerwan A, Al-Jabir A, et al. World Health Organization declares global emergency: a review of the 2019 novel coronavirus (COVID-19). Int J Surg 2020 Apr;76:71-76 [FREE Full text] [CrossRef] [Medline]
  12. NC Department of Health and Human Services website. Coronavirus disease 2019 (COVID-19) in North Carolina   URL: [accessed 2020-03-25]
  13. Ferguson N, Laydon D, Nedjati-Gilani G, Imai N, Ainslie K, Baguelin M, et al. Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand. Imperial College London 2020. [CrossRef]
  14. Wu JT, Leung K, Leung GM. Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet 2020 Feb 29;395(10225):689-697 [FREE Full text] [CrossRef] [Medline]
  15. McCloskey B, Heymann DL. SARS to novel coronavirus - old lessons and new lessons. Epidemiol Infect 2020 Feb 05;148:e22 [FREE Full text] [CrossRef] [Medline]
  16. Du Z, Wang L, Cauchemez S, Xu X, Wang X, Cowling BJ, et al. Risk for transportation of coronavirus disease from Wuhan to other cities in China. Emerg Infect Dis 2020 May;26(5):1049-1052. [CrossRef] [Medline]
  17. Wilder-Smith A, Freedman DO. Isolation, quarantine, social distancing and community containment: pivotal role for old-style public health measures in the novel coronavirus (2019-nCoV) outbreak. J Travel Med 2020 Mar 13;27(2) [FREE Full text] [CrossRef] [Medline]
  18. Read J, Bridgen J, Cummings D, Ho A, Jewell C. Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions. medRxiv 2020. [CrossRef]
  19. Verity R, Okell L, Dorigatti I, Winskill P, Whittaker C, Imai N, et al. Estimates of the severity of coronavirus disease 2019: a model-based analysis. Lancet Infect Dis 2020 Mar 20. [CrossRef]
  20. Nishiura H, Kobayashi T, Yang Y, Hayashi K, Miyama T, Kinoshita R, et al. The rate of underascertainment of novel coronavirus (2019-nCoV) infection: estimation using Japanese passengers data on evacuation flights. J Clin Med 2020 Feb 04;9(2). [CrossRef] [Medline]
  21. Wu J, Leung K, Bushman M, Kishore N, Niehus R, De SP, et al. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat Med 2020 Mar 19:506-510. [CrossRef]
  22. Niehus R, De SP, Taylor A, Lipsitch M. Quantifying bias of COVID-19 prevalence and severity estimates in Wuhan, China that depend on reported cases in international travelers. medRxiv 2020. [CrossRef]
  23. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet 2020 Mar 09;395(10229):1054-1062. [CrossRef]
  24. NC Budget and Management.   URL:
  25. Kermack WO, McKendrick AG. A contribution to the mathematical theory of epidemics. Proc R Soc Lond A 1927 Aug 01;115(772):700-721. [CrossRef]
  26. Tim Churches Health Data Science Blog. 2020 Feb 18. Analysing Covid-19 (2019-nCoV) Outbreak Data with R   URL: [accessed 2020-05-23]
  27. Zhu C, Byrd R, Lu P, Nocedal J. A limited memory algorithm for bound constrained optimisation. SIAM J Sci Stat Comput 1995;16(5):1190-1208. [CrossRef]
  28. Patel P, Borkowf CB, Brooks JT, Lasry A, Lansky A, Mermin J. Estimating per-act HIV transmission risk: a systematic review. AIDS 2014 Jun 19;28(10):1509-1519 [FREE Full text] [CrossRef] [Medline]
  29. Pitzer V, Leung G, Lipsitch M. Estimating variability in the transmission of severe acute respiratory syndrome to household contacts in Hong Kong, China. Am J Epidemiol 2007 Aug 01;166(3):355-363 [FREE Full text] [CrossRef] [Medline]
  30. Zhang S, Diao M, Yu W, Pei L, Lin Z, Chen D. Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis. Int J Infect Dis 2020 Apr;93:201-204. [CrossRef] [Medline]
  31. Churches T, Jorm L. COVOID: a flexible, freely available stochastic individual contact model for exploring COVID-19 intervention and control strategies. JMIR Preprints 2020. [CrossRef]
  32. Vetter P, Eckerle I, Kaiser L. Covid-19: a puzzle with many missing pieces. BMJ 2020 Feb 19;368:m627. [CrossRef] [Medline]
  33. Sharfstein JM, Becker SJ, Mello MM. Diagnostic testing for the novel coronavirus. JAMA 2020 Mar 09. [CrossRef] [Medline]
  34. Murray C. Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months. medRxiv 2020. [CrossRef]
  35. Glanz J, Leatherby L, Bloch M, Smith M, Buchanan L, Wu J, et al. The New York Times. 2020. Coronavirus could overwhelm U.S. without urgent action, estimates say   URL: [accessed 2020-05-23]
  36. Weissman GE, Crane-Droesch A, Chivers C, Luong T, Hanish A, Levy MZ, et al. Locally informed simulation to predict hospital capacity needs during the COVID-19 pandemic. Ann Intern Med 2020 Apr 07. [CrossRef]
  37. Lutz C, Huynh M, Schroeder M, Anyatonwu S, Dahlgren FS, Danyluk G, et al. Applying infectious disease forecasting to public health: a path forward using influenza forecasting examples. BMC Public Health 2019 Dec 10;19(1):1659. [CrossRef] [Medline]

COVID-19: coronavirus disease
CRI: Cities Readiness Initiative
ODE: ordinary differential equations
RK4: fourth-order Runge-Kutta method
SARS: severe acute respiratory syndrome
SIR: susceptible-infected-removed

Edited by T Sanchez; submitted 14.04.20; peer-reviewed by A Sudaryanto, C Basch; comments to author 06.05.20; revised version received 16.05.20; accepted 18.05.20; published 19.06.20


©Philip J Turk, Shih-Hsiung Chou, Marc A Kowalkowski, Pooja P Palmer, Jennifer S Priem, Melanie D Spencer, Yhenneko J Taylor, Andrew D McWilliams. Originally published in JMIR Public Health and Surveillance (, 19.06.2020.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on, as well as this copyright and license information must be included.