This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.

The novel coronavirus SARS-CoV-2, which causes COVID-19, has resulted in a global pandemic. Since its emergence in December 2019, the virus has infected millions of people, caused the deaths of hundreds of thousands, and resulted in incalculable social and economic damage. Understanding the infectivity and transmission dynamics of the virus is essential to determine how best to reduce mortality while ensuring minimal social restrictions on the lives of the general population. Anecdotal evidence is available, but detailed studies have not yet revealed whether infection with the virus results in immunity.

The objective of this study was to use mathematical modeling to investigate the reinfection frequency of COVID-19.

We have used the SIR (Susceptible, Infected, Recovered) framework and random processing based on empirical SARS-CoV-2 infection and fatality data from different regions to calculate the number of reinfections that would be expected to occur if no immunity to the disease occurred.

Our model predicts that cases of reinfection should have been observed by now if primary SARS-CoV-2 infection did not protect individuals from subsequent exposure in the short term; however, no such cases have been documented.

This work concludes that infection with SARS-CoV-2 provides short-term immunity to reinfection and therefore offers useful insight for serological testing strategies, lockdown easing, and vaccine development.

The novel coronavirus SARS-CoV-2 is thought to have originated in China in late 2019, and has since spread globally, resulting in the COVID-19 pandemic. In the 8 months since the first confirmed case, the virus has resulted in 24 million confirmed infections and over 820,000 deaths, and has caused substantial social and economic damage.

SIR (Susceptible, Infected, Recovered) modeling uses a set of differential equations to determine how the number of infected and recovered individuals changes over time given a specified rate of infection and recovery. It was first used in 1927 by Kermack et al [
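For reference, the classic SIR system is commonly written as the three coupled equations below, where β is the infection rate and γ is the recovery rate. This is the textbook mass-action form, given here as a reminder and not as the exact formulation used later in this study.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - \gamma I, \\
\frac{dR}{dt} &= \gamma I.
\end{aligned}
```

Summing the three equations gives d(S + I + R)/dt = 0, so the total population is conserved.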

In this study, we used an extension to the SIR framework that distinguished between infected and reinfected individuals to model empirical data taken from a compiled COVID-19 data set [

We used national infection and mortality data from a variety of sources to investigate the reinfection dynamics of SARS-CoV-2. Unless specified, national data on infections and deaths from SARS-CoV-2 were acquired from the Our World in Data database compiled by the Oxford Martin School at the University of Oxford [

The simulations were initially completed for the United Kingdom, which had a high number of confirmed cases at the time the data were accessed. The other regions were selected for specific reasons: Australia, as an example of a region with low numbers of recorded cases, in order to investigate the limit of expected reinfections; Germany, as one of the few countries with recorded recovery data; Italy, because the number of infections and deaths had peaked by May 17, 2020; Singapore, because as a city-state its population density is very high; Switzerland, because hospitalization data were available at the time the data were accessed; and the United States as a whole, for comparison with New York City, which was the worst affected part of the United States at the time.

A number of assumptions have been made. Where possible, they have been made so that the number of reinfections is underestimated. These assumptions are as follows:

There is a large lag time for recovery to take place (28 days) [

The incubation period was modeled as 6 days [

The model does not consider social distancing or shielding and so assigns an equal probability of an infection to all individuals.

Not all infections have been recorded due to lack of testing, misdiagnosis, or asymptomatic infection [

Infections and recoveries are not necessarily recorded on the date that they first occurred.

There is no emigration out of, or immigration into, a population of interest.

The model assumes a homogeneous population density, with no societal structure (eg, equal number of residents per household).

We based our model on the compartmental SIR framework, but differentiated between initial and subsequent infections, resulting in a 6-state model (susceptible, infected, recovered, infected [2 or more times], recovered [2 or more times], and deceased) (

A simple representation of the model. S_{t} represents the number of persons susceptible to infection who have had no prior infections on day t; I_{t} is the number of people currently infected for the first time on day t; R_{t} is the number of people who have recovered once on day t; I’_{t} is the number of people who have been infected 2 or more times and are infected on day t; R’_{t} is the number of people who have recovered 2 or more times and are not infected on day t of the model, and D_{t} is the number of deceased persons on day t of the model. Further symbols are defined in Table 1.

Definition of parameters in the model.

Parameter | Definition |

x_{i} | Random number |

t | The number of days into the simulation; t=1 for the day of the first infection |

t_{recovery} | The average number of days for recovery (t_{recovery}=28) |

t_{incubation} = t_{inc.} | The average number of days before an infection is seen (t_{incubation}=6) |

t_{max} | The number of days over which the simulation is run |

S_{t} | The number of susceptible individuals on day t |

I_{t} | The number of infected (once) individuals on day t |

R_{t} | The number of recovered (once) individuals on day t |

I’_{t} | The number of infected (2 or more times) individuals on day t |

R’_{t} | The number of recovered (2 or more times) individuals on day t |

D_{t} | The number of deceased individuals from SARS-CoV-2 on day t |

β_{t} | The infection rate on day t |

γ_{t} | The recovery rate on day t |

m_{t} | The death rate on day t |

N | The total population in the model; N = S_{t} + I_{t} + R_{t} + I’_{t} + R’_{t} + D_{t} ∀ t |

N_{t}^{infected} | The number of infected individuals on day t; N_{t}^{infected} = I_{t} + I’_{t} |

N_{t}^{uninfected} | The number of uninfected individuals on day t; N_{t}^{uninfected} = S_{t} + R_{t} + R’_{t} |

n_{t}^{infected} | The number of new infections on day t; n_{t}^{infected} = β_{t}S_{t–t_{incubation}}(I_{t–t_{incubation}} + I’_{t–t_{incubation}}) + β_{t}R_{t–t_{incubation}}(I_{t–t_{incubation}} + I’_{t–t_{incubation}}) + β_{t}R’_{t–t_{incubation}}(I_{t–t_{incubation}} + I’_{t–t_{incubation}}) |

n_{t}^{recovered} | The number of recovering individuals on day t; n_{t}^{recovered} = γ_{t}I_{t–t_{recovery}} + γ_{t}I’_{t–t_{recovery}} |

n_{t}^{deaths} | The number of deaths on day t; n_{t}^{deaths} = m_{t}I_{t–1} + m_{t}I’_{t–1} |

i_{t}^{first time} | The number of first-time infections on day t (ie, the number of S_{t} → I_{t} transitions on day t) |
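The daily flows defined in the table can be sketched in code. The following is a minimal, deterministic illustration only: the names `State`, `new_infections`, `new_recoveries`, `new_deaths`, and `step` are hypothetical, the rates are passed in as plain numbers, and the random processes used in the study are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Populations of the 6 states on one day (names are illustrative)."""
    S: float   # susceptible, no prior infection
    I: float   # infected for the first time
    R: float   # recovered once
    I2: float  # infected 2 or more times (I' in the table)
    R2: float  # recovered 2 or more times (R' in the table)
    D: float   # deceased


def new_infections(beta_t, lagged):
    """The three terms of n_t^infected, evaluated on the state t_incubation days ago."""
    infectious = lagged.I + lagged.I2
    from_S = beta_t * lagged.S * infectious    # first-time infections
    from_R = beta_t * lagged.R * infectious    # reinfections of the once-recovered
    from_R2 = beta_t * lagged.R2 * infectious  # reinfections of the repeatedly recovered
    return from_S, from_R, from_R2


def new_recoveries(gamma_t, lagged):
    """The two terms of n_t^recovered, evaluated on the state t_recovery days ago."""
    return gamma_t * lagged.I, gamma_t * lagged.I2


def new_deaths(m_t, previous):
    """The two terms of n_t^deaths, evaluated on the previous day's state."""
    return m_t * previous.I, m_t * previous.I2


def step(current, lag_inc, lag_rec, previous, beta_t, gamma_t, m_t):
    """Apply one day's flows to `current`; the total population is unchanged.

    Assumption: flows are applied as expected values with no clipping,
    so this is only sensible while every flow is small relative to its source.
    """
    fs, fr, fr2 = new_infections(beta_t, lag_inc)
    r1, r2 = new_recoveries(gamma_t, lag_rec)
    d1, d2 = new_deaths(m_t, previous)
    return State(
        S=current.S - fs,
        I=current.I + fs - r1 - d1,
        R=current.R + r1 - fr,
        I2=current.I2 + fr + fr2 - r2 - d2,
        R2=current.R2 + r2 - fr2,
        D=current.D + d1 + d2,
    )
```

Because every outflow from one state appears as an inflow to another, the sum of the six populations stays equal to N, matching the constraint given in the table.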

When using the model, the rates of infection, recovery, and fatality for each state were assumed to be independent of how many infections a host had previously had. The number of susceptible persons at the beginning of the simulation, N, was taken to be the population of the region of interest [_{max}. The simulation was repeated 10,000 times to produce expectation values and standard deviations for the number of individuals classified as reinfections.

By pooling the number of cases in the infected (2 or more times), recovered (2 or more times), and deceased (after 2 or more infections) states at the end of the simulation, we calculated an estimate of the number of reinfections that would be expected to occur. This number represents the total population that had passed through the infected (2 or more times) state by the end of the simulation.
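The repeat-and-pool procedure can be illustrated with a simplified stochastic sketch. This is not the study's code: the function names and the inputs are hypothetical, deaths are ignored, the two recovered states are pooled into one, and each recorded case is assumed to fall on a uniformly random uninfected person. It counts reinfection events, which under the pooling assumption can slightly overcount people infected more than twice.

```python
import random
import statistics


def simulate_reinfections(daily_cases, population, t_recovery=28, rng=None):
    """Count reinfection events in one random run (simplified, see assumptions)."""
    rng = rng or random.Random()
    susceptible = population  # never infected
    recovered = 0             # recovered at least once, currently uninfected
    reinfections = 0
    for t, cases in enumerate(daily_cases):
        # Assumption: everyone infected t_recovery days ago recovers today.
        if t >= t_recovery:
            recovered += daily_cases[t - t_recovery]
        uninfected = susceptible + recovered
        p_prior = recovered / uninfected if uninfected > 0 else 0.0
        # Each new case hits a previously recovered person with probability p_prior.
        hits = sum(rng.random() < p_prior for _ in range(cases))
        hits = min(hits, recovered)
        reinfections += hits
        recovered -= hits
        susceptible -= cases - hits
    return reinfections


def expected_reinfections(daily_cases, population, t_recovery=28, n_runs=1000, seed=0):
    """Repeat the run n_runs times and return (mean, standard deviation)."""
    rng = random.Random(seed)
    results = [
        simulate_reinfections(daily_cases, population, t_recovery, rng)
        for _ in range(n_runs)
    ]
    return statistics.mean(results), statistics.stdev(results)
```

With a run shorter than the recovery lag, for example 20 days of cases and a 28-day lag, nobody has recovered yet and the sketch returns zero reinfections, which mirrors why a long lag leads to an underestimate.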

Unless otherwise stated, the average recovery time used in the simulations was set as 28 days, as this is greater than the median recovery time suggested in the report of the WHO-China Joint Mission on Coronavirus Disease 2019 [

We initially ran the simulation for data in the United Kingdom over the course of 106 days (from the first recorded case on February 1 until May 17, 2020, when the data were accessed).

Plots of the populations of each state in the model over the course of a typical simulation, using infection data from the United Kingdom. (A) An example plot of the susceptible population in the model over the course of 106 days (from the first recorded case on February 1 until May 17, 2020, when the data were accessed). (B) An example plot of the populations that are infected for the first time or recovered from a single infection. (C) An example plot of the populations of the simulation that have been reinfected and have recovered from an infection twice. (D) An example plot of the number of deceased individuals through the course of the simulation.

The number of predicted reinfections and their standard deviation in different locations worldwide as predicted from the model. Unless otherwise stated, these figures represent simulations using the total number of infections for each region and are modeled without the data on the number of recoveries.

Region | Reinfections, mean (SD) | Infections, N | Reinfections as a % of the total infections |

Australia | 0.1 (0.3) | 7036 | 0.0014 |

Germany | 20 (5) | 174,355 | 0.011 |

Germany (with recovery data) | 79 (9) | 174,355 | 0.05 |

Italy | 63 (8) | 224,760 | 0.028 |

New York City | 209 (15) | 189,027 | 0.11 |

New York City (hospitalizations) | 7 (3) | 48,462 | 0.004 |

Singapore | 4 (2) | 27,356 | 0.014 |

Switzerland | 4 (2) | 30,587 | 0.013 |

United Kingdom | 43 (7) | 240,161 | 0.018 |

United States | 402 (20) | 1,467,884 | 0.027 |
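The final column of the table is the mean number of reinfections divided by the total number of infections, expressed as a percentage. As a worked check on three of the rows, using a hypothetical helper named `reinfection_percentage`:

```python
def reinfection_percentage(mean_reinfections, infections):
    """Mean reinfections as a percentage of total recorded infections."""
    return 100.0 * mean_reinfections / infections


# (mean reinfections, total infections) copied from three rows of the table
rows = {
    "Italy": (63, 224_760),
    "United Kingdom": (43, 240_161),
    "United States": (402, 1_467_884),
}

for region, (mean_reinfections, infections) in rows.items():
    pct = reinfection_percentage(mean_reinfections, infections)
    print(f"{region}: {pct:.3f}%")
```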

The simulations were repeated with data from Australia, Italy, New York City, Singapore, Switzerland, and the United States. The mean number of expected reinfections in each region or country over the 10,000 simulations that were run is shown in

Next, we repeated our simulation for New York City, with the total number of infections replaced by the number of hospitalizations. When we ran the simulation with an input of the total number of infections, the number of secondary infections continued to increase throughout the simulation, when the numbers appear to start to peak (

Comparison of total infections versus hospitalization data in New York City. Plots of the infected (2 or more times) and recovered (2 or more times) states for (A) New York using all infection data and (B) New York using only the hospitalization data.

Recovery data were sparse or unavailable for most regions, likely due to a lack of follow-up testing. Recovery data were available from Germany, and we therefore compared the results of our simulation for Germany with and without the recovery data as an input. The models used a 28-day lag before the recoveries started, meaning very few secondary recoveries took place (

Plots of infected (2 or more times) and recovered (2 or more times) states with (A) modeled recovery data and (B) actual recovery data. Use of actual recovery data from Germany suggests that the number of recovered individuals, and hence reinfections, are underestimated in our model.

The 28-day lag time used for the modeled recovery data ensured that we underestimated the recovery rate, and therefore the rate of reinfection as well. To investigate a more realistic recovery rate, the United Kingdom simulations were repeated using the modeled recovery data with shorter lag times for recovery. As expected, we found that the rate of reinfection increased as the lag time was decreased from 28 days to 7 days, as a larger population had recovered from a primary infection. With a 7-day lag time, the number of people in the infected (2 or more times) state peaked at day 101 of the simulation (

In this work, we have presented a modeling strategy used to determine whether SARS-CoV-2 reinfections can occur. We modeled actual infection and fatality data from different regions around the world and found that all regions investigated, with the exception of Australia, should have recorded cases of reinfections if primary infection with SARS-CoV-2 did not provide some level of immunity. The actual number of cases of reinfection that have been reported in any of these regions or countries to date is zero, suggesting that worldwide, primary SARS-CoV-2 infection provides short-term immunity.

In Australia, the number of confirmed SARS-CoV-2 infections at the time the data were accessed was relatively low [

We also found that rehospitalization cases should have been seen among hospitalized cases in New York City; such cases are unlikely to be missed, as patients are processed and tested on admission to hospital. To date, however, no reinfections have verifiably been recorded anywhere in the world. A report from South Korea suggested that 116 patients recovered from COVID-19 had tested positive by RT-PCR (reverse transcription–polymerase chain reaction) for the virus again [

Our results are supported by a number of animal challenge studies, which also show that immunity to SARS-CoV-2 can be conferred. A study in rhesus macaques showed that, following initial viral clearance, the monkeys showed a reduction in their median viral load in comparison with primary infection when rechallenged with SARS-CoV-2 [

A report from the WHO-China Joint Mission on Coronavirus Disease 2019 estimated the recovery time for SARS-CoV-2 infection to be 2 weeks for mild cases and 3-6 weeks for severe or critical cases [

Our model has a number of limitations, including the lack of modeling of any social structure, the fact that individuals who have been infected may change their shielding behaviors, differing recovery times from person to person, and missing information regarding migration into and out of regions of interest. In spite of this, the results documented here provide strong evidence, based on real data, to suggest that there is at least short-term immunity conferred by an initial infection with SARS-CoV-2. This has implications for serological testing strategies, lockdown easing timescales, and vaccine development. Our modeling strategy can also be extended to understand the reinfection dynamics of future pandemics.

Plots of infected (2 or more times) and recovered (2 or more times) populations in the United Kingdom when the lag time used was (A) 28 days, (B) 21 days, (C) 14 days, and (D) 7 days. Use of lower recovery lag times leads to an increase in the number of expected reinfections.

RNA: ribonucleic acid

RT-PCR: reverse transcription–polymerase chain reaction

SARS: severe acute respiratory syndrome

SIR: Susceptible, Infected, Recovered

WHO: World Health Organization

We thank Dr Barak Gilboa for a critical reading of the manuscript. This work was supported by a Royal Society Dorothy Hodgkin Research Fellowship (DKR00620) and a Research Grant for Research Fellows (RGF\R1\180054) to NCR. Data and code available on request.

None declared.