Published on 16.11.20 in Vol 6, No 4 (2020): Oct-Dec
Preprints (earlier versions) of this paper are available at http://preprints.jmir.org/preprint/21168, first published Oct 15, 2020.
Reinfection with SARS-CoV-2: Discrete SIR (Susceptible, Infected, Recovered) Modeling Using Empirical Infection Data
Background: The novel coronavirus SARS-CoV-2, which causes the COVID-19 disease, has resulted in a global pandemic. Since its emergence in December 2019, the virus has infected millions of people, caused the deaths of hundreds of thousands, and resulted in incalculable social and economic damage. Understanding the infectivity and transmission dynamics of the virus is essential to determine how best to reduce mortality while ensuring minimal social restrictions on the lives of the general population. Anecdotal evidence is available, but detailed studies have not yet revealed whether infection with the virus results in immunity.
Objective: The objective of this study was to use mathematical modeling to investigate the reinfection frequency of COVID-19.
Methods: We have used the SIR (Susceptible, Infected, Recovered) framework and random processing based on empirical SARS-CoV-2 infection and fatality data from different regions to calculate the number of reinfections that would be expected to occur if no immunity to the disease occurred.
Results: Our model predicts that cases of reinfection should have been observed by now if primary SARS-CoV-2 infection did not protect individuals from subsequent exposure in the short term; however, no such cases have been documented.
Conclusions: This work concludes that infection with SARS-CoV-2 provides short-term immunity to reinfection and therefore offers useful insight for serological testing strategies, lockdown easing, and vaccine development.
JMIR Public Health Surveill 2020;6(4):e21168
The novel coronavirus SARS-CoV-2 is thought to have originated in China in late 2019, and has since spread globally, resulting in the COVID-19 pandemic. In the 8 months since the first confirmed case, the virus has resulted in 24 million confirmed infections and over 820,000 deaths, and has caused substantial social and economic damage.
SIR (Susceptible, Infected, Recovered) modeling uses a set of differential equations to determine how the numbers of infected and recovered individuals change over time, given specified rates of infection and recovery. It was first used in 1927 by Kermack et al and has since been used to model epidemics from AIDS [ ] to SARS (severe acute respiratory syndrome) [ ]. Variations of SIR modeling have been used during the COVID-19 pandemic to look at the varying burden on health care systems based on public health intervention [ ], the absence of a stable disease-free equilibrium [ ], and infection rate [ ], as well as the eventual size of the overall pandemic [ ]. An extension of the model has also been used to simulate the changing death rate as a function of the number of individuals infected, and it was found that an equilibrium point was reached where there were no further reinfections [ ].
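For reference, the classical continuous-time SIR equations can be written as below. This is the standard textbook form rather than the discrete model used in this study; here S, I, and R are the sizes of the three compartments, β is the infection rate, and γ is the recovery rate, and the total S + I + R stays constant.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - \gamma I, \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
```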
In this study, we used an extension to the SIR framework that distinguished between infected and reinfected individuals to model empirical data taken from a compiled COVID-19 data set, in order to investigate the reinfection frequency of the disease. We aimed to determine whether cases classified as “reinfections” would be expected to occur, although to date there are no definitive cases of reinfection reported in the scientific literature.
We used national infection and mortality data from a variety of sources to investigate the reinfection dynamics of SARS-CoV-2. Unless specified, national data on infections and deaths from SARS-CoV-2 were acquired from the Our World in Data database compiled by the Oxford Martin School at the University of Oxford; the hospitalization cases in Switzerland were obtained from the Federal Office of Public Health in Switzerland [ ]; the data for the city of New York were obtained from the New York City Health website [ ]; the population of New York City was obtained from the 2019 New York census [ ]; and the recovery data for Germany were sourced from Trading Economics [ ], which obtains its data from the World Health Organization (WHO). For each geographical region, the data were taken from the date of the first recorded infection up until May 17, 2020, when the data were accessed.
Choice of Geographical Regions
The simulations were initially completed for the United Kingdom, where, at the time the data were accessed, there was a high number of confirmed cases. Australia was selected as an example of a region with low numbers of recorded cases, in order to investigate the limit of expected reinfections; Germany was selected as it was one of the few countries with recorded recovery data; Italy was studied since the number of infections and deaths had peaked by May 17, 2020; Singapore was unique as a city-state so population density for the nation was very high; Switzerland was selected since hospitalization data were available at the time the data were accessed; and the United States as a whole was compared with New York City, which was the worst affected part of the United States at the time.
A number of assumptions have been made. Where possible, they have been made so that the number of reinfections is underestimated. These assumptions are as follows:
- There is a large lag time for recovery to take place (28 days) [ , ].
- The incubation period was modeled as 6 days [ ].
- The model does not consider social distancing or shielding and so assigns an equal probability of an infection to all individuals.
- Not all infections have been recorded due to lack of testing, misdiagnosis, or asymptomatic infection [ ].
- Infections and recoveries are not necessarily recorded on the date that they first occurred.
- There is no emigration out of, or immigration into, a population of interest.
- The model assumes a homogeneous population density, with no societal structure (eg, equal number of residents per household).
We based our model on the compartmental SIR framework, but differentiated between initial and subsequent infections, resulting in a 6-state model (susceptible, infected, recovered, infected [2 or more times], recovered [2 or more times], and deceased). The number of infections and deaths each day was taken from national statistics (as described above). Where available, recovery data were used; otherwise, recoveries were modeled with a 28-day lag time (with the number of recoveries representing those individuals who did not die during the 28-day recovery time). “Recovered” individuals were selected stochastically from the populations of the states 28 days prior [ , ], “infected” individuals from the populations of the states 6 days prior (due to the incubation period [ ]), and “deceased” individuals from the populations of the states 1 day prior.
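The stochastic selection step described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `allocate_new_infections`, the snapshot dictionary, and the use of weighted sampling with replacement are assumptions made for the example. It splits one day's recorded infections among the susceptible, recovered (once), and recovered (2 or more times) compartments in proportion to their sizes 6 days earlier.

```python
import random
from collections import Counter


def allocate_new_infections(n_new, snapshot, rng):
    """Split n_new recorded infections among the uninfected compartments.

    snapshot is a hypothetical dict with keys "S", "R", and "R2" holding the
    compartment sizes from the incubation period (6 days) earlier.
    Sampling is with replacement, which is only an approximation; it is
    reasonable when n_new is small compared with the compartment sizes.
    """
    states = ["S", "R", "R2"]
    weights = [snapshot[k] for k in states]
    if n_new == 0 or sum(weights) == 0:
        return {k: 0 for k in states}
    # Each new case picks a compartment with probability proportional to its size.
    draws = Counter(rng.choices(states, weights=weights, k=n_new))
    return {k: draws.get(k, 0) for k in states}


if __name__ == "__main__":
    rng = random.Random(0)
    print(allocate_new_infections(100, {"S": 900, "R": 80, "R2": 20}, rng))
```

Cases drawn from "S" are first-time infections, while cases drawn from "R" or "R2" would be counted as reinfections.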
|Symbol||Definition|
|$t$||The number of days into the simulation; $t = 1$ for the day of the first infection|
|$t_{\mathrm{recovery}}$||The average number of days for recovery ($t_{\mathrm{recovery}} = 28$)|
|$t_{\mathrm{incubation}} = t_{\mathrm{inc.}}$||The average number of days before an infection is seen ($t_{\mathrm{incubation}} = 6$)|
|$t_{\mathrm{max}}$||The number of days over which the simulation is run|
|$S_t$||The number of susceptible individuals on day $t$|
|$I_t$||The number of infected (once) individuals on day $t$|
|$R_t$||The number of recovered (once) individuals on day $t$|
|$I'_t$||The number of infected (2 or more times) individuals on day $t$|
|$R'_t$||The number of recovered (2 or more times) individuals on day $t$|
|$D_t$||The number of deceased individuals from SARS-CoV-2 on day $t$|
|$\beta_t$||The infection rate on day $t$|
|$\gamma_t$||The recovery rate on day $t$|
|$m_t$||The death rate on day $t$|
|$N$||The total population in the model; $N = S_t + I_t + R_t + I'_t + R'_t + D_t \;\; \forall\, t$|
|$N_t^{\mathrm{infected}}$||The number of infected individuals on day $t$; $N_t^{\mathrm{infected}} = I_t + I'_t$|
|$N_t^{\mathrm{uninfected}}$||The number of uninfected individuals on day $t$; $N_t^{\mathrm{uninfected}} = S_t + R_t + R'_t$|
|$n_t^{\mathrm{infected}}$||The number of new infections on day $t$; $n_t^{\mathrm{infected}} = \beta_t S_{t-t_{\mathrm{inc.}}}(I_{t-t_{\mathrm{inc.}}} + I'_{t-t_{\mathrm{inc.}}}) + \beta_t R_{t-t_{\mathrm{inc.}}}(I_{t-t_{\mathrm{inc.}}} + I'_{t-t_{\mathrm{inc.}}}) + \beta_t R'_{t-t_{\mathrm{inc.}}}(I_{t-t_{\mathrm{inc.}}} + I'_{t-t_{\mathrm{inc.}}})$|
|$n_t^{\mathrm{recovered}}$||The number of recovering individuals on day $t$; $n_t^{\mathrm{recovered}} = \gamma_t I_{t-t_{\mathrm{recovery}}} + \gamma_t I'_{t-t_{\mathrm{recovery}}}$|
|$n_t^{\mathrm{deaths}}$||The number of deaths on day $t$; $n_t^{\mathrm{deaths}} = m_t I_{t-1} + m_t I'_{t-1}$|
|$i_t^{\mathrm{first\ time}}$||The number of first-time infections on day $t$ (ie, the number of $S_t \to I_t$ transitions on day $t$)|
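As a worked step, the three terms in the definition of the number of new infections share the same infection rate and the same infected population, so they can be collected into a single product. The shorthand τ = t − t_incubation is introduced here only for readability. Under the stated assumption that the infection rate does not depend on infection history, the expected share of a day's new infections that are reinfections is then the recovered fraction of the uninfected population.

```latex
\begin{aligned}
n_t^{\mathrm{infected}}
  &= \beta_t \,\bigl(S_\tau + R_\tau + R'_\tau\bigr)\bigl(I_\tau + I'_\tau\bigr)
   = \beta_t \, N_\tau^{\mathrm{uninfected}} \, N_\tau^{\mathrm{infected}},
  \qquad \tau = t - t_{\mathrm{incubation}}, \\[4pt]
\frac{\text{expected reinfections on day } t}{n_t^{\mathrm{infected}}}
  &= \frac{R_\tau + R'_\tau}{S_\tau + R_\tau + R'_\tau}
\end{aligned}
```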
When using the model, the rates of infection, recovery, and fatality for each state were assumed to be independent of how many infections a host had previously had. The number of susceptible persons at the beginning of the simulation, N, was taken to be the population of the region of interest [, ]. After all infections, recoveries, and deaths for a day, the number of days into the simulation was increased by one, t → t + 1 up to tmax. The simulation was repeated 10,000 times to produce expectation values and standard deviations for the number of individuals classified as reinfections.
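The repeated-simulation procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code used in the study: it ignores deaths and the incubation lag, models recoveries purely with a fixed lag, and uses toy input data. The names `simulate_once` and `expected_reinfections` are hypothetical. Each recorded case is assigned at random to an uninfected individual, and the run counts the people who enter the infected (2 or more times) state for the first time.

```python
import random
import statistics


def simulate_once(daily_cases, population, rng, t_recovery=28):
    """Run one stochastic pass; return the number of people infected at least twice."""
    s, r, r2 = population, 0, 0            # susceptible, recovered once, recovered 2+ times
    first_by_day, repeat_by_day = [], []   # daily case counts, kept for lagged recoveries
    reinfected_people = 0
    for t, n_new in enumerate(daily_cases):
        # Cases recorded t_recovery days earlier recover now (deaths are ignored here).
        if t >= t_recovery:
            r += first_by_day[t - t_recovery]
            r2 += repeat_by_day[t - t_recovery]
        n_first = n_repeat = 0
        for _ in range(n_new):
            pool = s + r + r2
            if pool == 0:
                break                       # nobody left who can be infected
            u = rng.randrange(pool)         # pick one uninfected individual uniformly
            if u < s:
                s -= 1
                n_first += 1
            elif u < s + r:
                r -= 1
                n_repeat += 1
                reinfected_people += 1      # first entry into the 2+ infections state
            else:
                r2 -= 1
                n_repeat += 1               # third or later infection, already counted
        first_by_day.append(n_first)
        repeat_by_day.append(n_repeat)
    return reinfected_people


def expected_reinfections(daily_cases, population, n_runs=1000, seed=0):
    """Repeat the simulation and return (mean, standard deviation) of reinfections."""
    rng = random.Random(seed)
    counts = [simulate_once(daily_cases, population, rng) for _ in range(n_runs)]
    return statistics.mean(counts), statistics.stdev(counts)


if __name__ == "__main__":
    # Toy input: 20 recorded cases per day for 60 days in a population of 10,000.
    mean, sd = expected_reinfections([20] * 60, population=10_000, n_runs=200)
    print(f"expected reinfections: {mean:.1f} (SD {sd:.1f})")
```

The default of 1000 runs is chosen only to keep the sketch quick; the study used 10,000 repetitions.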
By pooling the number of cases in the infected (2 or more times), recovered (2 or more times), and deceased (after 2 or more infections) states at the end of the simulation, we calculated an estimate of the number of reinfections that would be expected to occur. This number represents the total population that had passed through the infected (2 or more times) state by the end of the simulation.
Unless otherwise stated, the average recovery time used in the simulations was set as 28 days, as this is greater than the median recovery time suggested in the report of the WHO-China Joint Mission on Coronavirus Disease 2019.
Simulations of UK Infection Data Suggest a Small Number of Reinfections Should Have Occurred
We initially ran the simulation for data in the United Kingdom over the course of 106 days (from the first recorded case on February 1 until May 17, 2020, when the data were accessed). The corresponding figure shows how the population of each state in the model changed over the course of a typical simulation. The number of susceptible individuals initially remained steady until day 55, when there was a sharp decline due to the increase in primary infections (panel A). The number of individuals infected just once started to increase steadily after day 40 and continued to do so until day 92. After the 28-day lag time, the individuals infected once started to recover, resulting in an increase in the recovered (once) state through to the end of the simulation (panel B). As the number of recovered individuals started to increase, so did the number of people infected for a second time. The number of people recovered for the second time started to increase after the 28-day recovery lag time (panel C). The number of deaths started to rise from day 55 onwards, and fatalities continued to increase through to the end of the simulation (panel D). In the United Kingdom, the number of expected reinfections was calculated to be 43 (SD 7), which makes up 0.018% of the total infections. The first reinfection for the United Kingdom was on day 82 (SD 5), corresponding to April 22, 2020.
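The mapping between simulation days and calendar dates follows from the convention that t = 1 is the day of the first recorded infection. A minimal sketch, assuming February 1, 2020, as the UK start date given above (the helper names are hypothetical):

```python
from datetime import date, timedelta

UK_FIRST_CASE = date(2020, 2, 1)  # day t = 1 of the UK simulation, as stated in the text


def simulation_day_to_date(day, first_case=UK_FIRST_CASE):
    """Convert a 1-based simulation day into a calendar date."""
    return first_case + timedelta(days=day - 1)


def date_to_simulation_day(d, first_case=UK_FIRST_CASE):
    """Convert a calendar date into a 1-based simulation day."""
    return (d - first_case).days + 1


if __name__ == "__main__":
    print(simulation_day_to_date(82))  # 2020-04-22
```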
|Region||Reinfections, mean (SD)||Infections, N||Reinfections as a % of the total infections|
|Germany (with recovery data)||79 (9)||174,355||0.05|
|New York City||209 (15)||189,027||0.11|
|New York City (hospitalizations)||7 (3)||48,462||0.004|
|United Kingdom||43 (7)||240,161||0.018|
|United States||402 (20)||1,467,884||0.027|
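The percentage column can be reproduced from the other two columns. The short check below is an illustrative sketch using the tabulated means and infection counts; the function name is hypothetical, and the New York City hospitalization row is left out because the text does not state which denominator was used for it.

```python
def reinfection_percentage(reinfections, infections):
    """Reinfections expressed as a percentage of the total recorded infections."""
    return 100.0 * reinfections / infections


# (mean reinfections, recorded infections), copied from the table above
rows = {
    "Germany (with recovery data)": (79, 174_355),
    "New York City": (209, 189_027),
    "United Kingdom": (43, 240_161),
    "United States": (402, 1_467_884),
}

if __name__ == "__main__":
    for region, (reinfections, infections) in rows.items():
        print(f"{region}: {reinfection_percentage(reinfections, infections):.3f}%")
```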
Simulations of Infection Data in Other Regions Show a Similar Trend
The simulations were repeated with data from Australia, Italy, New York City, Singapore, Switzerland, and the United States. The mean number of expected reinfections in each region or country over the 10,000 simulations that were run is given in the table of reinfections by region. In all cases, with the exception of Australia, our model predicts that reinfection cases should occur.
Comparison of Infection and Hospitalization Data in New York City
Next, we repeated our simulation for New York City, with the total number of infections replaced by the number of hospitalizations. When we ran the simulation with the total number of infections as the input, the number of secondary infections continued to increase through the simulation until the numbers appeared to start to peak (panel A). This was followed by an increase in the number of secondary recoveries after the 28-day recovery lag time. In comparison, the hospitalization data for New York showed no secondary recoveries, as the reinfections occurred later in the simulation (panel B). The total number of predicted reinfections from the New York hospitalization data was 7 (SD 3).
Inclusion of Recovery Data Suggests That Predicted Reinfections Are Underestimated
Recovery data were sparse or unavailable for most regions, likely due to lack of follow-up testing. Recovery data were available from Germany, and we therefore compared the results of our simulation for Germany with and without the recovery data as an input. The models used a 28-day lag before the recoveries started, meaning very few secondary recoveries took place (panels A and B). There were 49 more reinfections with the recovery data than with the modeled data (79 vs 30).
The 28-day lag time used for the modeled recovery data ensured that we underestimated the recovery rate, and therefore the rate of reinfection as well. To investigate a more realistic recovery rate, the United Kingdom simulations were repeated using the modeled recovery data while shortening the lag time for recovery. As expected, we found that the rate of reinfection increased as the lag time was decreased from 28 days through to 7 days, as there was a larger population that had recovered from a primary infection. With a 7-day lag time, the number of people in the infected (2 or more times) state peaked at day 101 of the simulation. The total number of people reinfected throughout the simulation increased as the lag time decreased, with 43 (SD 7), 83 (SD 9), 139 (SD 12), and 209 (SD 14) reinfections for 28-day, 21-day, 14-day, and 7-day recovery lag times, respectively.
In this work, we have presented a modeling strategy used to determine whether SARS-CoV-2 reinfections can occur. We modeled actual infection and fatality data from different regions around the world and found that all regions investigated, with the exception of Australia, should have recorded cases of reinfections if primary infection with SARS-CoV-2 did not provide some level of immunity. The actual number of cases of reinfection that have been reported in any of these regions or countries to date is zero, suggesting that worldwide, primary SARS-CoV-2 infection provides short-term immunity.
In Australia, the number of confirmed SARS-CoV-2 infections at the time the data were accessed was relatively low, possibly due to early social distancing measures, the closing of international borders, and mass testing and tracing measures. The number of modeled reinfections (0.1 [SD 0.3]) reflects this, and so even without immunity from infection no reinfections would be expected to occur. Similarly, in Switzerland and Singapore, very low numbers of reinfections were predicted by the model (6.2 [SD 2.5] and 6 [SD 2], respectively). It is possible that these very low numbers of reinfection cases could have been missed due to misdiagnosis or lack of follow-up testing. We therefore applied our model to data from Germany [ ], Italy [ ], New York City [ , ], and the United States as a whole [ ], which have recorded far higher numbers of SARS-CoV-2 infections (174,355; 224,760; 189,027; and 1,467,884, respectively, when the data were accessed). The number of reinfection cases predicted for these regions was 30 (SD 6), 89 (SD 9), 209 (SD 15), and 402 (SD 20) for Germany, Italy, New York City, and the United States, respectively. We therefore conclude that it is very unlikely that all of these predicted cases, if true, were missed due to misdiagnosis or lack of testing.
We also found that rehospitalization cases should have been seen amongst hospitalized cases in New York City; it is unlikely that these cases would be missed, as people are processed and tested on admission into hospital. To date, however, no reinfections have verifiably been recorded anywhere in the world. A report from South Korea suggested that 116 patients who had recovered from COVID-19 had tested positive again for the virus by RT-PCR (reverse transcription–polymerase chain reaction); however, this has since been explained as the “false-positive” detection of remnants of viral RNA (ribonucleic acid) rather than reactivation or reinfection. The lack of documented reinfections suggests that short-term immunity to the virus is produced by an initial infection, although our model cannot predict whether this immunity will last over longer timescales.
Our results are supported by a number of animal challenge studies, which also show that immunity to SARS-CoV-2 can be conferred. A study in rhesus macaques showed that, following initial viral clearance, the monkeys showed a reduction in their median viral load in comparison with primary infection when rechallenged with SARS-CoV-2. Similarly, Ryan et al [ ] demonstrated that rechallenged ferrets were fully protected from acute lung pathology. An adenovirus-vector vaccine tested on rhesus macaques elicited a humoral and cellular response that, on challenge with the virus, proved to significantly reduce the viral load in bronchoalveolar lavage fluid and respiratory tract tissue [ ]. However, a longitudinal study by Seow et al [ ] showed that the immunity conferred against SARS-CoV-2 may only be short term. Our model proposes that reinfection cases should have already started to appear by April 2020, suggesting a possible lower limit for immunity duration.
A report from the WHO-China Joint Mission on Coronavirus Disease 2019 estimated the recovery time for SARS-CoV-2 infection to be 2 weeks for mild cases and 3-6 weeks for severe or critical cases; based on this, we used a long (28 days) recovery lag time in the modeled data. Comparison with real-world recovery data from Germany suggested that the actual recovery time may be significantly shorter, giving rise to an underestimation of the reinfection rate in our modeled data. This was supported by an increase in the number of predicted reinfections in the UK simulations when we used a shorter recovery lag time of 7, 14, or 21 days. In addition, there were no allowances in our model for transmission being localized to regions smaller than a nation or city; the daily infection data were likely to be only a fraction of the total number of infections due to asymptomatic or mild infections not being recorded, and infections were recorded on the date of testing, not the actual date of infection. We also note that significant differences in testing, reporting, and shielding of the vulnerable exist between the different regions in this study and that a large number of COVID-19 cases were missed in every region of interest (eg, in Geneva, unreported cases were estimated to be 11.6 infections per reported infection from April 6 to May 9 [ ]). In every region, we expect that the impact on our simulation would be to underestimate the number of reinfections. Taken together, this suggests that the actual reinfection rate would be significantly higher than that predicted by our model if there was no immunity conferred by prior infection.
Our model has a number of limitations, including the lack of modeling of any social structure, the fact that individuals who have been infected may change their shielding behaviors, differing recovery times from person to person, and missing information regarding migration into and out of regions of interest. In spite of this, the results documented here provide strong evidence, based on real data, to suggest that there is at least short-term immunity conferred by an initial infection with SARS-CoV-2. This has implications for serological testing strategies, lockdown easing timescales, and vaccine development. Our modeling strategy can also be extended to understand the reinfection dynamics of future pandemics.
We thank Dr Barak Gilboa for a critical reading of the manuscript. This work was supported by a Royal Society Dorothy Hodgkin Research Fellowship (DKR00620) and a Research Grant for Research Fellows (RGF\R1\180054) to NCR. Data and code available on request.
Conflicts of Interest
Plots of infected (2 or more times) and recovered (2 or more times) populations in the United Kingdom when the lag time used was (A) 28 days, (B) 21 days, (C) 14 days, and (D) 7 days. Use of lower recovery lag times leads to an increase in the number of expected reinfections.
- Kermack WO, McKendrick AG, Walker GT. A contribution to the mathematical theory of epidemics. Proc R Soc Lond A 1927 Aug;115(772):700-721. [CrossRef]
- Victor Okhuese A. Estimation of the Probability of Reinfection With COVID-19 by the Susceptible-Exposed-Infectious-Removed-Undetectable-Susceptible Model. JMIR Public Health Surveill 2020 May 13;6(2):e19097 [FREE Full text] [CrossRef] [Medline]
- Ng TW, Turinici G, Danchin A. A double epidemic model for the SARS propagation. BMC Infect Dis 2003 Sep 10;3(1):19 [FREE Full text] [CrossRef] [Medline]
- Ming W, Zhang C. Breaking down of healthcare system: mathematical modelling for controlling the novel coronavirus (2019-nCoV) outbreak in Wuhan, China. bioRxiv Preprint posted online January 30, 2020. [CrossRef]
- Victor A. Mathematical predictions for COVID-19 as a global pandemic. SSRN Journal 2020:e. [CrossRef]
- Nesteruk I. Statistics based predictions of coronavirus 2019-nCoV spreading in mainland China. medRxiv Preprint posted online February 13, 2020. [CrossRef]
- Batista M. Estimation of the final size of the COVID-19 epidemic. medRxiv Preprint posted online February 28, 2020. [CrossRef]
- Okhuese AV. Estimation of the Probability of Reinfection With COVID-19 by the Susceptible-Exposed-Infectious-Removed-Undetectable-Susceptible Model. JMIR Public Health Surveill 2020 May 13;6(2):e19097 [FREE Full text] [CrossRef] [Medline]
- Ritchie H, Roser M, Ortiz-Ospina E, Hasell J. Coronavirus pandemic (COVID-19) - Country by country. Our World in Data. 2020. URL: https://ourworldindata.org/coronavirus [accessed 2020-05-17]
- New coronavirus: Situation in Switzerland. Federal Office of Public Health. 2020. URL: https://www.bag.admin.ch/bag/en/home/krankheiten/ausbrueche-epidemien-pandemien/aktuelle-ausbrueche-epidemien/novel-cov/situation-schweiz-und-international.html [accessed 2020-05-17]
- Covid-19: Data. NYC Health. 2020. URL: https://www1.nyc.gov/site/doh/covid/covid-19-data.page [accessed 2020-05-17]
- United States Census Bureau. QuickFacts: Richmond County (Staten Island Borough), New York; New York County (Manhattan Borough), New York; Bronx County (Bronx Borough), New York; Kings County (Brooklyn Borough), New York; Queens County (Queens Borough), New York. Census.gov. 2019. URL: tinyurl.com/ybmwj89r [accessed 2020-05-17]
- Germany Coronavirus Recovered. Trading Economics. 2020. URL: https://tradingeconomics.com/germany/coronavirus-recovered [accessed 2020-05-17]
- Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). World Health Organization. 2020 Feb. URL: https://www.who.int/docs/default-source/coronaviruse/who-china-joint-mission-on-covid-19-final-report.pdf [accessed 2020-05-17]
- Bi Q, Wu Y, Mei S, Ye C, Zou X, Zhang Z, et al. Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. The Lancet Infectious Diseases 2020 Aug;20(8):911-919 [FREE Full text] [CrossRef]
- Backer JA, Klinkenberg D, Wallinga J. Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China, 20-28 January 2020. Euro Surveill 2020 Feb;25(5):e [FREE Full text] [CrossRef] [Medline]
- Stringhini S, Wisniak A, Piumatti G, Azman AS, Lauer SA, Baysson H, et al. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Geneva, Switzerland (SEROCoV-POP): a population-based study. The Lancet 2020 Aug;396(10247):313-319. [CrossRef]
- Cha S, Smith J. Explainer: South Korean findings suggest 'reinfected' coronavirus cases are false positives. Reuters. 2020 May 7. URL: https://www.reuters.com/article/us-health-coronavirus-southkorea-explain-idUSKBN22J0HR [accessed 2020-05-17]
- Chandrashekar A, Liu J, Martinot AJ, McMahan K, Mercado NB, Peter L, et al. SARS-CoV-2 infection protects against rechallenge in rhesus macaques. Science 2020 Aug 14;369(6505):812-817 [FREE Full text] [CrossRef] [Medline]
- Ryan K, Bewley K, Fotheringham S, Brown P, Hall Y, Marriott AC, et al. Dose-dependent response to infection with SARS-CoV-2 in the ferret model: evidence of protection to re-challenge. bioRxiv Preprint posted online May 29, 2020. [CrossRef]
- van Doremalen N, Lambe T, Spencer A, Belij-Rammerstorfer S, Purushotham JN, Port JR, et al. ChAdOx1 nCoV-19 vaccination prevents SARS-CoV-2 pneumonia in rhesus macaques. bioRxiv Preprint posted online May 13, 2020. [CrossRef]
- Seow J, Graham C, Merrick B, Acors S, Steel KJA, Hemmings O, et al. Longitudinal evaluation and decline of antibody responses in SARS-CoV-2 infection. medRxiv Preprint posted online July 11, 2020. [CrossRef]
|RNA: ribonucleic acid|
|RT-PCR: reverse transcription–polymerase chain reaction|
|SARS: severe acute respiratory syndrome|
|SIR: Susceptible, Infected, Recovered|
|WHO: World Health Organization|
Edited by T Sanchez; submitted 09.06.20; peer-reviewed by M Salman, S Case; comments to author 20.07.20; revised version received 31.08.20; accepted 22.09.20; published 16.11.20
©Andrew McMahon, Nicole C Robb. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 16.11.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.