Published on 19.04.2021 in Vol 7, No 4 (2021): April

Preprints (earlier versions) of this paper are available at https://www.medrxiv.org/content/10.1101/2020.04.23.20076562v1.
A Recursive Model of the Spread of COVID-19: Modelling Study

Authors of this article:

Sergey O Ilyin

Original Paper

AV Topchiev Institute of Petrochemical Synthesis, Russian Academy of Sciences, Moscow, Russian Federation

Corresponding Author:

Sergey O Ilyin, PhD

AV Topchiev Institute of Petrochemical Synthesis, Russian Academy of Sciences

29 Leninsky prospekt

Moscow, 119991

Russian Federation

Phone: 7 9168276852

Email: s.o.ilyin@gmail.com


Background: The major medical and social challenge of the 21st century is COVID-19, caused by the novel coronavirus SARS-CoV-2. Critical issues include the rate at which the coronavirus spreads and the effect of quarantine measures and population vaccination on this rate. Knowledge of the laws of the spread of COVID-19 will enable assessment of the effectiveness and reasonableness of the quarantine measures used, as well as determination of the necessary level of vaccination needed to overcome this crisis.

Objective: This study aims to establish the laws of the spread of COVID-19 and to use them to develop a mathematical model to predict changes in the number of active cases over time, possible human losses, and the rate of recovery of patients, to make informed decisions about the number of necessary beds in hospitals, the introduction and type of quarantine measures, and the required threshold of vaccination of the population.

Methods: This study analyzed the onset of COVID-19 spread in countries such as China, Italy, Spain, the United States, the United Kingdom, Japan, France, and Germany based on publicly available statistical data. The change in the number of COVID-19 cases, deaths, and recovered persons over time was examined, considering the possible introduction of quarantine measures and isolation of infected people in these countries. Based on the data, the virus transmissibility and the average duration of the disease at different stages were evaluated, and a model based on the principle of recursion was developed. Its key features are the separation of active (nonisolated) infected persons into a distinct category and the prediction of their number based on the average duration of the disease in the inactive phase and the concentration of these persons in the population in the preceding days.

Results: Specific values for SARS-CoV-2 transmissibility and COVID-19 duration were estimated for different countries. In China, the viral transmissibility was 3.12 before quarantine measures were implemented and 0.36 after the strictest measures were in place. For the other countries, the viral transmissibility was 2.28-2.76 initially, and it then decreased to 0.87-1.29 as a result of quarantine measures. Therefore, it can be expected that the spread of SARS-CoV-2 will be suppressed if 56%-64% of the total population is vaccinated or has recovered from COVID-19.

Conclusions: The quarantine measures adopted in most countries are too weak compared to those previously used in China. Therefore, it is not expected that the spread of COVID-19 will stop and the disease will cease to exist naturally or owing to quarantine measures. Active vaccination of the population is needed to prevent the spread of COVID-19. Furthermore, the required specific percentage of vaccinated individuals depends on the magnitude of viral transmissibility, which can be evaluated using the proposed model and statistical data for the country of interest.

JMIR Public Health Surveill 2021;7(4):e21468

doi:10.2196/21468



The first mathematical models to predict the development of infectious diseases were used in the early 20th century [1,2]. In 1927, Kermack and McKendrick [3] proposed the use of differential equations for calculations, dividing the human population into people susceptible to disease (S) and those who had already recovered (R). Susceptible persons became infected (I) at some rate of transmission and then recovered at a different rate. Their model became known by the acronym SIR because it simultaneously calculates the numbers of susceptible, infected, and recovered persons. This model served as the basis for subsequent models, which modified the equations and added categories of persons beyond the three basic ones so that the features of particular diseases could be considered. Since then, various models have been created that consider the possibility of re-infection (SIS model) [4] and death (SIRD model) [5], the existence of an incubation period (SEIR model) [6], and temporary immunity of infants (MSIR model) [7], among others.

When a new infection appears, neither the set of population categories to be considered in the model nor the rate of transition of people from one category to another is known. Current information about the features of the COVID-19 infection caused by the novel coronavirus (SARS-CoV-2), and about how people perceive it and respond to it, should serve as a basis for building a model to describe the spread of this virus. Two features stand out: first, a long incubation period during which infected persons are contagious to others, and second, the isolation of detected infected persons, who as a result become conditionally noncontagious. The combination of these two factors makes this novel coronavirus infection unique. For most infections, the opposite is true: infected people are not dangerous to others during the incubation period and become contagious after it ends. For this reason, a new model that considers these circumstances is needed to predict the spread of COVID-19. However, the duration of the immunity produced after recovery from COVID-19 is currently unknown. In addition, very little information is available to accurately calculate the rate of recovery among patients with COVID-19: a small percentage of the population recovers within just a week after contracting the infection, whereas the majority of people experience the illness for a long time. Therefore, the proposed model cannot be final, but it is necessary for forecasting and management decisions.


The model for COVID-19 spread is based on a set of parameters whose values are unique for each country due to differences in population density, human behavior, date of virus introduction, and government actions. The set includes the following parameters:

  • d0 is the date of the initiation of the epidemic; it is not the date of detection of the first infected person but the date of appearance of the first undetected (or detected too late) person.
  • d1, d2, and d3 are dates of change in the behavior of the population, for example, due to the awareness of the reality of what is happening and the introduction of quarantine and its tightening.
  • tD is the average time from infection to isolation of the infected person, which is equal to the incubation period assumed to be 6 days (ranging from 5.2 to 6.4 days according to different sources [8,9]); theoretically, this parameter can be reduced by testing the entire population, but this is feasible only for small communities.
  • R0, R1, R2, and R3 are the viral transmissibilities, each equal to the average number of people who will be infected by one person before he or she is isolated; they depend on the behavior of the population at different stages of the epidemic. When R is less than 1.0, the epidemic fades; when R is greater than 1.0, it grows.
  • r0, r1, r2, and r3 are the reduced viral transmissibilities, each equal to the average number of people who will be infected by one person per day: r = R/tD; to suppress the spread of COVID-19, r should be less than 0.167 (that is, 1/tD for tD = 6 days).
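As a small illustration of the relation r = R/tD and the threshold 1/tD, the following sketch converts a few of the R values reported later in Table 1 into per-day values. The function names are hypothetical and chosen only for this example; tD = 6 days is the value assumed in the text.

```python
T_D = 6  # days from infection to isolation, as assumed in the text


def reduced_transmissibility(R, t_d=T_D):
    """Per-day transmissibility r = R / tD."""
    return R / t_d


def epidemic_fades(r, t_d=T_D):
    """True when r is below the threshold 1/tD, which is the same as R < 1."""
    return r * t_d < 1.0


# R values taken from Table 1 of the paper.
for label, R in [("China R0", 3.12), ("China R3", 0.36), ("Italy R0", 2.55)]:
    r = reduced_transmissibility(R)
    print(f"{label}: R = {R}, r = {r:.3f}, fades = {epidemic_fades(r)}")
```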

The evaluation of the spread of the virus is based on the calculation of the following data:

  • ND(di) is the number of infected persons detected by date di, which equals the total number of infected persons tD days (here, 6 days) earlier:
ND(di) = NT(di-tD)
  • NT(di) is the total number of infected persons on date di, which is the sum of the total number of infected persons the day before and the number of new infected persons that, in turn, is equal to the product of the reduced transmissibility and the number of active infected persons the day before (taking into account that those who have been previously infected cannot be reinfected):
NT(di) = NT(di-1)+r0×NA(di-1)×[1-NT(di-1)/NP],

where NP is the total population.

In the case of vaccination of the population and considering the temporary nature of the immunity received due to SARS-CoV-2 infection or vaccination, the above expression will be as follows:

NT(di) = NT(di-1)+r0×NA(di-1)×[1-[NT(di-1)+ NV(di-1)-NT(di-1-tim)-NV(di-1-tim)]/NP],

where tim is the average duration of preserving full immunity against the virus after vaccination or disease, whereas NV(di) is the total number of vaccinated persons on date di who have not had COVID-19 in the last tim days;

  • NA(di) is the total number of active (undetected) infected persons on date di, which equals the difference between the total number of infected persons and the number of infected persons detected on the same day:
NA(di) = NT(di)-ND(di).

At the start of the epidemic (date d0), NA(d0) = 1, NT(d0) = 1, and ND(d0) = 0.
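The recursion above can be sketched in a few lines of Python. This is a minimal illustration for a constant r, not the author's spreadsheet; the function name `simulate`, the 30-day horizon, and the population of 60 million (a rough figure for Italy) are assumptions made for the example.

```python
def simulate(r, n_days, population, t_d=6):
    """Iterate the recursion for n_days after d0 with a constant r.

    Returns three lists indexed by day (day 0 is d0):
    NT (total infected), ND (detected), NA (active, undetected).
    """
    NT = [1.0]  # NT(d0) = 1
    ND = [0.0]  # ND(d0) = 0
    NA = [1.0]  # NA(d0) = 1
    for i in range(1, n_days + 1):
        # Share of the population that has not been infected yet.
        susceptible_share = 1.0 - NT[i - 1] / population
        nt = NT[i - 1] + r * NA[i - 1] * susceptible_share
        # ND(di) = NT(di - tD); nobody was infected before d0,
        # so the lookup is 0 while i < tD.
        nd = NT[i - t_d] if i >= t_d else 0.0
        NT.append(nt)
        ND.append(nd)
        NA.append(nt - nd)  # NA(di) = NT(di) - ND(di)
    return NT, ND, NA


# Example: R0 = 2.55 (Italy, Table 1); the population is an assumed round number.
NT, ND, NA = simulate(r=2.55 / 6, n_days=30, population=60_000_000)
print(f"day 30: total = {NT[30]:.0f}, detected = {ND[30]:.0f}, active = {NA[30]:.0f}")
```

Because ND is simply NT shifted by tD days, the number of active persons NA equals the number of infections that occurred during the last tD days.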

Thus, in order to calculate the virus spread dynamics, it is necessary to know the values of only two parameters—d0 and r0. In the case of changing the behavior of the population from the date d1, parameter r0 changes its value from this date to become r1. If the behavior changes again, a pair of d2 and r2 will appear, and so on.
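A change of behavior on dates d1, d2, and d3 can be sketched as a schedule of (date, R) pairs. The example below uses the dates and R values reported for China in Table 1; the function name, the end date, and the population of 1.4 billion are assumptions for illustration. It also assumes that the R in force on date di governs the infections added on that date, which the text does not specify exactly.

```python
from datetime import date, timedelta


def simulate_schedule(d0, stages, end, population, t_d=6):
    """Run the recursion from d0 to end with stage-dependent R.

    stages is a list of (start_date, R) pairs in chronological order;
    the first pair must start on d0.
    Returns (dates, NT), where NT[i] is the total infected on dates[i].
    """
    n_days = (end - d0).days
    dates = [d0 + timedelta(days=i) for i in range(n_days + 1)]
    NT = [1.0]
    for i in range(1, n_days + 1):
        # R in force on date d_i: the last stage that has already started.
        R = [R_k for start, R_k in stages if start <= dates[i]][-1]
        nd_prev = NT[i - 1 - t_d] if i - 1 >= t_d else 0.0
        na_prev = NT[i - 1] - nd_prev
        NT.append(NT[i - 1] + (R / t_d) * na_prev * (1.0 - NT[i - 1] / population))
    return dates, NT


# Dates and R values for China from Table 1; population is an assumed round number.
china_stages = [
    (date(2019, 12, 31), 3.12),
    (date(2020, 1, 23), 1.68),
    (date(2020, 1, 29), 1.14),
    (date(2020, 2, 10), 0.36),
]
dates, NT = simulate_schedule(date(2019, 12, 31), china_stages,
                              date(2020, 3, 31), population=1.4e9)
new_cases = [b - a for a, b in zip(NT, NT[1:])]
peak_day = dates[1 + new_cases.index(max(new_cases))]
print(f"total by {dates[-1]}: {NT[-1]:.0f}; new infections peak on {peak_day}")
```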

However, it is more difficult to model human losses correctly. Two more parameters need to be considered:

  • L is the apparent lethality rate that is equal to the ratio of the number of deaths to the sum of those who died or recovered;
  • tL is the average time from infection to death.

These two parameters depend on the efficacy of treatment and may vary as physicians gain experience and as hospitals overflow. The number of deaths on date di equals the total number of people infected tL days earlier multiplied by the lethality rate:

NL(di) = NT(di-tL)×L

Because the two parameters in the equation (tL and L) have the same effect on the resulting value, the precision of their evaluation is lower than that for viral transmissibility. It should be understood that the fewer asymptomatic and mild cases of the disease are detected, the more the lethality rate is overestimated. The average time from infection to death was found to be about 8 days, and this duration is used in the calculations for all countries.
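The death count described above is the NT series shifted by tL days and scaled by L. A minimal sketch follows, using a made-up NT series (doubling daily) purely for illustration; L = 0.04 is the value listed for China in Table 1 and tL = 8 days is the value given in the text.

```python
def deaths(NT, lethality, t_l=8):
    """NL(di) = NT(di - tL) * L, with NL = 0 before anyone has been ill for tL days."""
    return [lethality * NT[i - t_l] if i >= t_l else 0.0 for i in range(len(NT))]


# Hypothetical NT series, not real data: 1, 2, 4, ... for 12 days.
NT = [float(2 ** i) for i in range(12)]
NL = deaths(NT, lethality=0.04)
print([round(x, 2) for x in NL])
```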

The situation with predicting the number of recovered persons is even worse due to the appearance of an even greater number of independent parameters:

NR(di) = NT(di-tM)×kM+NT(di-tS)×kS,

where kM and kS are the shares of mildly and seriously ill patients (kM+kS+L = 1), and tM and tS are the corresponding times from infection to healing:

kM+kS+L = 1
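The recovered count combines two shifted copies of NT, one for mild and one for serious cases. The sketch below derives kS from kM + kS + L = 1 and uses the China values from Table 1 (kM = 0.24, L = 0.04, tM = 14, tS = 29). The flat NT series is hypothetical and chosen only so that the result is easy to check by hand.

```python
def recovered(NT, k_m, lethality, t_m, t_s):
    """NR(di) = NT(di - tM) * kM + NT(di - tS) * kS, with kS = 1 - kM - L."""
    k_s = 1.0 - k_m - lethality

    def lagged(i, lag):
        # NT shifted by `lag` days; zero before the epidemic started.
        return NT[i - lag] if i >= lag else 0.0

    return [k_m * lagged(i, t_m) + k_s * lagged(i, t_s) for i in range(len(NT))]


# Hypothetical series: 100 infected persons from day 0 onward, no new cases.
NT = [100.0] * 40
NR = recovered(NT, k_m=0.24, lethality=0.04, t_m=14, t_s=29)
print(f"day 14: {NR[14]:.1f} recovered, day 29: {NR[29]:.1f} recovered")
```

With these inputs, the remaining 4 of the 100 persons are the deaths accounted for by L.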

The model equations are presented in discrete form (rather than as differential equations) so that the model can be easily reproduced in any spreadsheet editor. At first glance, it seems that the model does not take into account the existence of asymptomatic carriers of infection, but this is not the case: since the share of asymptomatic carriers in the population does not change over time, their presence is taken into account implicitly through the value of the transmissibility. This model can be denoted by the abbreviation SILRD, which means that it takes into account Susceptible, Infected, Isolated, Recovered, and Dead persons.


Based on historical data on disease development in eight countries (China, Italy, Spain, the United States, the United Kingdom, Japan, France, and Germany [10]), the model was tested (Figure 1) and most of its parameters were found (Table 1). For all the countries, the viral transmissibility at the start of the epidemic was between 2.28 and 3.12. The highest viral transmissibility was found in China, where one person infected about three others, probably because of higher population density. The introduction and progressive strengthening of quarantine measures decreased the viral transmissibility, which became noticeable 6 days later as a decline in the rate of new cases. All the countries introduced quarantine measures gradually. The initial restrictions reduced the viral transmissibility to 1.20-1.74, which was not adequate (a transmissibility of less than 1.0 was needed), and the spread of the virus continued to accelerate. As a result, all the countries, with the exception of Japan, initiated stricter measures, thus reducing the transmissibility to 0.87-1.14. Japan focused on timely detection and isolation of infected persons. It can be concluded that this strategy does not work, as can be seen from the curve of the total number of cases in Japan, which alternately slows down and accelerates again. This happens because Japan has been successfully isolating most of the infected persons, but a few infected people remain nonisolated and can cause another outbreak. By contrast, China further strengthened its containment measures, which reduced the transmissibility to 0.36 and quickly suppressed the epidemic (in 6 weeks according to the model).

Figure 1. Time dependences of the total number of COVID-19 cases, deaths, and recovered cases. Dots show actual data, whereas lines represent the result of calculations using the model.
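Table 1. Parameters identified from the models used in different countries. Dates are given as dd.mm.yy.

| Parameter | China | Italy | Spain | USA | Japan | UK | Germany | France |
|---|---|---|---|---|---|---|---|---|
| R0 | 3.12 | 2.55 | 2.76 | 2.46 | 2.46 | 2.28 | 2.34 | 2.34 |
| d0 | 31.12.19 | 01.02.20 | 12.02.20 | 11.02.20 | 19.01.20 | 11.02.20 | 08.02.20 | 07.02.20 |
| R1 | 1.68 | 1.74 | 1.56 | 1.20 | 1.29 | 1.74 | 1.32 | 1.56 |
| d1 | 23.01.20 | 26.02.20 | 10.03.20 | 21.03.20 | 27.01.20 | 10.03.20 | 14.03.20 | 07.03.20 |
| R2 | 1.14 | 1.38 | 0.90 | 0.90 | N/A^a | 1.32 | 0.87 | 0.87 |
| d2 | 29.01.20 | 08.03.20 | 23.03.20 | 06.04.20 | N/A | 24.03.20 | 24.03.20 | 26.03.20 |
| R3 | 0.36 | 0.92 | N/A | N/A | N/A | 1.05 | N/A | N/A |
| d3 | 10.02.20 | 18.03.20 | N/A | N/A | N/A | 01.04.20 | N/A | N/A |
| L^b | 0.04 | 0.05–0.14 | 0.03–0.105 | 0.03–0.06 | 0.027 | 0.03–0.135 | 0.004–0.04 | 0.03–0.14 |
| kM | 0.24 | 0.16 | 0.5 | 0.14 | 0.973 | N/A | 0.75 | 0.3 |
| tM | 14 | 9 | 14 | 14 | 31 | N/A | 17 | 12 |
| tS | 29 | 39 | unknown | unknown | 31 | N/A | unknown | unknown |
| dE | N/A | 12.06.21 | 24.03.21 | 30.05.21 | 21.03.21 | >31.12.21 | 31.12.20 | 03.01.21 |
| NT,max | N/A | 310000 | 330000 | 1690000 | 52800000 | 2270000 | 210000 | 220000 |

^a N/A: not applicable.

^b An interval indicates gradual growth.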


On the date on which the analyzed data set ends, all European countries (except the United Kingdom) had only managed to reduce the transmissibility slightly below 1.0. From a practical point of view, this means that the number of people falling ill daily in these countries was decreasing, but so slowly that the epidemic in these countries could not have ended before, at best, the end of 2020. In reality, these countries have partially canceled quarantine measures, causing an increase in viral transmissibility and, consequently, a new rise in the number of infected persons and a postponement of the possible end date of the epidemic. It should be understood that any relaxation of quarantine measures would lead to increased transmissibility and a resumption of the accelerated spread of the virus. To prevent this from happening after the quarantine restrictions have been removed, the viral transmissibility must remain below 1.0. By way of example, the original transmissibility was 2.55 in the case of Italy; therefore, 61% of the Italian population must be either infected and then recovered (provided the immunity produced is durable and strong) or vaccinated against SARS-CoV-2 so that the transmissibility remains less than 1.0 when quarantine measures are lifted. At the time of writing this manuscript, 0.33% of the Italian population had been infected according to official statistics [10]. Statistics may not take into account asymptomatic and mild cases of the disease, whose number may be 4-50 times higher than that of the officially recorded cases (so far, this is supported only by unconfirmed reports). Even if this were true, the percentage of infected and recovered persons is still significantly lower than necessary, and the removal of quarantine measures will inevitably return the growth rate of the number of infected people to almost its original level. In other words, the rapid development, production, and subsequent application of a vaccine are vital to overcoming the COVID-19 crisis in the near future.
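The 61% figure for Italy, and the 56%-64% range quoted in the abstract, are consistent with requiring R0 × (1 − immune share) < 1, that is, an immune share above 1 − 1/R0. The text does not state this formula explicitly, so the sketch below should be read as an inferred reading of the model's [1 − NT/NP] factor rather than the author's stated method; the function name is hypothetical.

```python
def immune_share_needed(R0):
    """Smallest immune share that keeps the effective transmissibility below 1.

    Derived from R0 * (1 - share) < 1, i.e. share > 1 - 1/R0.
    """
    return 1.0 - 1.0 / R0


# Initial transmissibilities from Table 1: lowest (UK), Italy, and highest outside China (Spain).
for country, R0 in [("UK", 2.28), ("Italy", 2.55), ("Spain", 2.76)]:
    print(f"{country}: R0 = {R0}, required immune share = {immune_share_needed(R0):.1%}")
```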

Thus, the model makes it possible to forecast how the situation will develop and to draw conclusions about the effectiveness of quarantine measures. By way of example, it helps determine the current number of active infected persons (NA(di)), the approximate date of isolation of the last infected person (dE), and the number of people that could eventually be infected under the current quarantine (NT,max). According to the calculations, the efforts made by many European countries, the United States, and Japan to stop the spread of COVID-19 are not as effective as those implemented previously in China. Most countries have been able to achieve a daily reduction in the number of infected people, but even in these cases, the viral transmissibility remains too high for the country to overcome the epidemic crisis within a reasonable time. At the same time, suppressing the epidemic, albeit slowly, allows time for vaccine development and launch into mass production.
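One way to obtain dE and NT,max from the recursion is to iterate until the modelled number of active persons falls below one. The text does not give the exact stopping rule, so the threshold NA < 1 is an assumption, as are the function name, the 10-year cap, and the population figure. The schedule uses the China values from Table 1.

```python
from datetime import date, timedelta


def forecast_end(d0, stages, population, t_d=6, max_days=3650):
    """Return (dE, NT_max) for a fixed schedule of (start_date, R) stages.

    dE is taken as the first date after the last stage begins on which
    NA = NT(di) - NT(di - tD) drops below 1 (an assumed criterion).
    Returns (None, last NT) if that does not happen within max_days.
    """
    NT = [1.0]
    last_start = stages[-1][0]
    for i in range(1, max_days + 1):
        today = d0 + timedelta(days=i)
        R = [R_k for start, R_k in stages if start <= today][-1]
        nd_prev = NT[i - 1 - t_d] if i - 1 >= t_d else 0.0
        na_prev = NT[i - 1] - nd_prev
        NT.append(NT[i - 1] + (R / t_d) * na_prev * (1.0 - NT[i - 1] / population))
        nd = NT[i - t_d] if i >= t_d else 0.0
        if today > last_start and NT[i] - nd < 1.0:
            return today, NT[i]
    return None, NT[-1]


china_stages = [
    (date(2019, 12, 31), 3.12),
    (date(2020, 1, 23), 1.68),
    (date(2020, 1, 29), 1.14),
    (date(2020, 2, 10), 0.36),
]
d_e, nt_max = forecast_end(date(2019, 12, 31), china_stages, population=1.4e9)
print(f"estimated dE: {d_e}, approximate NT,max: {nt_max:.0f}")
```

The value returned as NT,max is the total on date dE, so it slightly understates the final total when a fraction of a case is still active.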

Acknowledgments

This work was carried out within the State Program of A.V. Topchiev Institute of Petrochemical Synthesis.

Conflicts of Interest

None declared.

  1. Brauer F, Castillo-Chávez C. Mathematical Models in Population Biology and Epidemiology. New York: Springer; 2001.
  2. Daley DJ, Gani J. Epidemic Modelling: An Introduction. New York: Cambridge University Press; 2005.
  3. Kermack WO, McKendrick AG. A contribution to the mathematical theory of epidemics. Proc R Soc Lond A 1927 Jan;115(772):700-721.
  4. Nåsell I. The quasi-stationary distribution of the closed endemic SIS model. Adv Appl Probab 2016 Jul 1;28(03):895-932.
  5. Wang P, Jia J. Stationary distribution of a stochastic SIRD epidemic model of Ebola with double saturated incidence rates and vaccination. Adv Differ Equ 2019 Oct 15;2019(1):1-16.
  6. Li MY, Muldowney JS. Global stability for the SEIR model in epidemiology. Math Biosci 1995 Feb;125(2):155-164.
  7. Bichara D, Iggidr A, Sallet G. Global analysis of multi-strains SIS, SIR and MSIR epidemic models. J Appl Math Comput 2013 Jun 7;44(1-2):273-292.
  8. Backer J, Klinkenberg D, Wallinga J. Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China, 20-28 January 2020. Euro Surveill 2020 Feb;25(5):2000062.
  9. Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med 2020 Mar 26;382(13):1199-1207.
  10. Coronavirus Update (Live). Worldometer. URL: https://www.worldometers.info/coronavirus [accessed 2020-10-12]

Edited by G Eysenbach; submitted 07.08.20; peer-reviewed by H Akram, J Opoku; comments to author 22.09.20; accepted 28.10.20; published 19.04.21

Copyright

©Sergey O Ilyin. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 19.04.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.