Issue: Math. Model. Nat. Phenom., Volume 15, 2020
Coronavirus: Scientific insights and societal aspects
Article Number: 74
Number of page(s): 13
DOI: https://doi.org/10.1051/mmnp/2020050
Published online: 10 December 2020
Inferring key epidemiological parameters and transmission dynamics of COVID-19 based on a modified SEIR model
^{1} Department of Mathematics, Shanghai Normal University, Shanghai 200234, P.R. China.
^{2} School of Information Engineering, Zhengzhou University, Zhengzhou 450052, P.R. China.
^{3} International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.
^{*} Corresponding author: qguo@shnu.edu.cn
Received: 6 May 2020
Accepted: 23 November 2020
This study aims to establish a model-based framework for inferring key transmission characteristics of the newly emerging outbreak of the coronavirus disease 2019 (COVID-19), especially the epidemic dynamics under quarantine conditions. Inspired by the shifting therapeutic levels and capacity at different stages of the COVID-19 pandemic, we propose a modified SEIR model in which the removal rate of quarantined hosts has two phases connected by a continuously tunable transition. We employ the Markov Chain Monte Carlo (MCMC) approach for inferring and forecasting the epidemiological dynamics from publicly available surveillance reports. The effectiveness of short-term prediction is illustrated using data sets from 10 demographic regions, including Chinese mainland and South Korea. In the surveillance period, the average R_{0} ranges from 1.74 to 3.28, and the median of the mean latent period does not exceed 10 days across the surveillance regions.
Mathematics Subject Classification: 92D30
Key words: COVID-19 / SEIR model / effective reproduction number / asymptomatic ratio / mean latent period
© The authors. Published by EDP Sciences, 2020
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
An outbreak of a novel coronavirus disease, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in December 2019 and was declared a public health emergency of international concern by the World Health Organization (WHO) on January 30, 2020. To date, the COVID-19 pandemic has led to more than 16 million confirmed cases and 0.6 million deaths worldwide, which constitutes an unprecedented challenge to the global public health community. As epidemic prevention and control become increasingly difficult in many countries, epidemiological modeling based on surveillance data provides useful mathematical tools for identifying and predicting disease outbreaks. Particularly important, among others, are the "Susceptible-Exposed-Infectious-Recovered" (SEIR) type models, which account for the presymptomatic period of COVID-19. Wu et al. [15] employed the SEIR model to estimate the basic reproduction number of the outbreak in Wuhan, China. Prem et al. [8] considered a deterministic age-structured SEIR model with quarantine control included as a strategy for preventing contagion. Walker et al. [12] used an age-structured stochastic SEIR model to fit the key parameters of the observed transmission dynamics of COVID-19. Wan et al. [13] analyzed the spread dynamics and trend of COVID-19 in Wuhan using the data from the city's closure to February 12, 2020. A series of papers [9, 10, 14] developed quarantined SEIR-type models by imposing restrictions on the mobility of patients, with model parameters fitted to the data reported by China CDC using the Markov Chain Monte Carlo (MCMC) method. Under the SEIR framework, Lin et al. [4] adopted a similar compartmental model to capture individual behavioural reactions and governmental actions against COVID-19. Li et al. [3] introduced a networked SEIR-type model with migration data within China, and employed Bayesian inference to infer critical epidemiological characteristics.
Recent months have witnessed the role and importance of mathematical modeling in the decision-making of governments in response to the current pandemic. However, detecting and forecasting the spread of infectious diseases at early stages remains a challenging task because of the lack of model selection criteria and the high computational cost of parameter tuning. Here, we aim to provide a modified SEIR model with asymptomatic and quarantined classes adapted to the newly emerged COVID-19, and propose an MCMC-based method for inferring the key parameters and related transmission characteristics of the COVID-19 pandemic.
2 Methods
The data used to fit the parameters in this paper are the reported numbers of confirmed and removed cases in Chinese mainland (23 January to 12 February 2020), France (7 March to 12 April 2020), Germany (7 March to 1 April 2020), Iran (7 March to 31 March 2020), Italy (7 March to 12 April 2020), Japan (20 March to 12 April 2020), South Korea (16 February to 9 March 2020), Spain (7 March to 12 April 2020), United Kingdom (7 March to 12 April 2020), and the United States (7 March to 16 June 2020). Only the Chinese mainland and South Korea data sets are used in the main text to illustrate the results; the results for the other regions are available in Table 2.
We first describe the SEIR-type dynamic model under quarantine measures, which is based on the previous works [5, 9, 10, 14] and is shown in Figure 1. In particular, the infectious population is divided into four compartments: the unquarantined infectious individuals with symptoms (I), the undetected asymptomatic carriers (A), the quarantined infectious individuals with symptoms (I_{q}), and the detected asymptomatic carriers (A_{q}). Given the evidence showing that COVID-19 is contagious even during the incubation period of infection, we modify the previous SEIR models [5, 9, 10, 14] as follows: (2.1)
where the notation is summarized in Table 1.
Note that we exclude, without loss of generality, the quarantined susceptible individuals (S_{q}) from our SEIR model by assuming that they constitute a subpopulation isolated from the reaction mixture during the epidemic spread. We therefore divide the total population into two parts: the quarantined susceptible population and an effective population. Here, we regard N_{e} as the effective population size, which is a fraction p of the total population during the epidemic outbreak, N_{e} = pN_{total}, where N_{total} is the total population size. In our numerical simulations, the effective population size N_{e} appears as a model parameter to be determined. By substituting S = N_{e} − E − E_{q} − I − I_{q} − A − A_{q} − R_{1} − R_{2}, the model (2.1) can be written as (2.2)
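The bookkeeping behind the effective population can be illustrated with a short Python sketch. It only encodes the two relations stated above, N_{e} = pN_{total} and the substitution for S; the function names and all numerical values are hypothetical and are not taken from the paper.

```python
def effective_population(p, n_total):
    """Effective population size N_e = p * N_total, with p a fraction in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction in [0, 1]")
    return p * n_total


def susceptible(n_e, compartments):
    """S = N_e - E - E_q - I - I_q - A - A_q - R_1 - R_2."""
    names = ("E", "E_q", "I", "I_q", "A", "A_q", "R_1", "R_2")
    return n_e - sum(compartments[name] for name in names)


# Illustrative numbers only (assumptions, not values from the paper).
n_e = effective_population(p=0.5, n_total=1000.0)
state = {"E": 10, "E_q": 5, "I": 8, "I_q": 4, "A": 3, "A_q": 2, "R_1": 1, "R_2": 1}
s = susceptible(n_e, state)
```

Because S is recovered from the other compartments, it does not need to be integrated as a separate state variable.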
Specifically, we adopted a two-phase removal rate for the quarantined symptomatic population as follows: (2.3)
where z_{1} and z_{2} are parameters controlling the low and high levels of the removal rate, respectively, and a and b are parameters controlling the position and steepness of the transition between the low and high rates. The intuition behind this choice is as follows. The cure rate is likely to be low at the early stage because of the novelty of the COVID-19 disease, to improve gradually after a buffering period, and eventually to saturate as the pandemic unfolds. Our numerical results show that this heuristic removal rate significantly improves the reproduction of the observed surveillance data and enhances the prediction of COVID-19 outbreaks. We note that the prediction accuracy does not depend sensitively on the specific form of the removal rate, and other conceivable protocols (e.g., a sigmoid function) generate similar results.
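As an illustration of the alternative mentioned above, the following Python sketch implements a sigmoid-shaped two-phase rate. It is not the functional form of equation (2.3); it only assumes the roles given in the text, with z1 and z2 as the low and high levels, a as the position of the transition and b as its steepness. The function name and the numerical values are hypothetical.

```python
import math


def two_phase_rate(t, z1, z2, a, b):
    """Smooth transition from z1 (early) to z2 (late), centred at t = a.

    The logistic weight 1 / (1 + exp(-b (t - a))) is written with tanh,
    which is mathematically identical and avoids overflow for large |b (t - a)|.
    """
    weight = 0.5 * (1.0 + math.tanh(0.5 * b * (t - a)))
    return z1 + (z2 - z1) * weight


# Illustrative values only (assumptions): low rate 0.02/day, high rate 0.10/day,
# transition centred on day 20.
rates = [two_phase_rate(t, z1=0.02, z2=0.10, a=20.0, b=0.5) for t in range(0, 41)]
```

Larger values of b make the switch between the two levels sharper, while a shifts the day on which the rate is halfway between z1 and z2.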
To infer the key transmission characteristics of COVID-19, including the effective reproduction number, the asymptomatic ratio, the mean latent period, and the inflection points, we fit the parameters of model (2.2) using the surveillance data collected from China CDC (http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm) and Johns Hopkins University (https://github.com/CSSEGISandData/COVID-19). Specifically, let Ĩ_{q}(t) denote the number of confirmed cases of COVID-19 on day t of the surveillance period, with an analogous notation for the removed cases; the least squares loss function for optimization of our SEIR model is then written as follows: (2.4)
where T is the surveillance period available for inferring our model parameters, and Î_{q}(t;Θ), Â_{q}(t;Θ) and the corresponding removed population denote the numerical solution of equation (2.2) over the surveillance period, computed with the fourth-order Runge-Kutta method. The epidemic parameters and initial conditions are collectively denoted by Θ = [α, β, ε, η, γ_{I}, γ_{A}, …]; see Table 1 for a more detailed description. To solve this high-dimensional optimization problem effectively, we perform MCMC sampling [9, 10] of the parameter space using the MATLAB toolbox MCMCSTAT [2], as shown in Algorithm 1. In our optimization procedure, we randomly generate 20 sets of initial values in a reasonable range around the parameters, and the proposed algorithm selects the initial value that yields the smallest sum of squared deviations from the surveillance data. We further examine numerically the initial-value sensitivity (IVS) of our model predictions and find that a considerable fraction of parameters share almost the same optimal initial values (which we call the "generic initial estimation", see Table 4) across different demographic regions, possibly due in part to the common epidemiological properties of the COVID-19 dynamics. This robustness with respect to initial values significantly alleviates the difficulty of the inference problem and renders the estimation of the (non-generic) parameters computationally tractable. It is noteworthy that our results are derived from local optima of the loss function subject to specific initial values of the (non-generic) parameters and therefore provide only suboptimal, rather than optimal, predictions; nevertheless, numerical experiments demonstrate the potential of our method for forecasting COVID-19 outbreaks in practice.
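The combination of a fourth-order Runge-Kutta solver with a least squares loss can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the two-variable toy system (a quarantined infectious class that grows and is removed at constant rates) is a hypothetical stand-in for model (2.2), and the loss simply sums squared deviations of the simulated confirmed and removed counts, which may differ in detail from equation (2.4). All names and values are assumptions.

```python
def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(theta, y0, n_days, steps_per_day=10):
    """Integrate the toy system and return the state at days 0..n_days."""
    growth, removal = theta  # hypothetical parameters of the toy system

    def rhs(t, y):
        i_q, _removed = y
        return [(growth - removal) * i_q, removal * i_q]

    h = 1.0 / steps_per_day
    t, y = 0.0, list(y0)
    trajectory = [list(y)]
    for _ in range(n_days):
        for _ in range(steps_per_day):
            y = rk4_step(rhs, t, y, h)
            t += h
        trajectory.append(list(y))
    return trajectory


def least_squares_loss(theta, y0, confirmed, removed):
    """Sum of squared deviations between simulated and reported daily counts."""
    model = simulate(theta, y0, len(confirmed) - 1)
    return sum((m[0] - c) ** 2 + (m[1] - r) ** 2
               for m, c, r in zip(model, confirmed, removed))


# Synthetic "surveillance data" generated from known parameters (illustration only).
true_theta = (0.3, 0.1)
traj = simulate(true_theta, [10.0, 0.0], 10)
confirmed = [row[0] for row in traj]
removed = [row[1] for row in traj]
loss_true = least_squares_loss(true_theta, [10.0, 0.0], confirmed, removed)
loss_off = least_squares_loss((0.25, 0.1), [10.0, 0.0], confirmed, removed)
```

In this synthetic setting the loss vanishes at the generating parameters and is strictly positive elsewhere, which is the property an optimizer or sampler exploits.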
Fig. 1 SEIR model with quarantined measures. 
Table 1 Full list of notations used in the SEIR model.
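The sampling strategy described in this section, namely choosing the best of 20 random initial guesses and then exploring the parameter space by MCMC, can be illustrated with a plain random-walk Metropolis sampler. This is a deliberately simplified substitute: the MCMCSTAT toolbox [2] implements the more elaborate DRAM scheme, and the sketch below is not Algorithm 1. The quadratic toy loss, the Gaussian likelihood with variance sigma2, and all names and values are assumptions made for illustration.

```python
import math
import random


def toy_loss(theta):
    """Hypothetical quadratic loss with its minimum at (2, -1)."""
    x, y = theta
    return (x - 2.0) ** 2 + (y + 1.0) ** 2


def best_of_random_starts(loss, low, high, n_starts=20, seed=0):
    """Draw n_starts random initial vectors in a box and keep the lowest-loss one."""
    rng = random.Random(seed)
    starts = [[rng.uniform(lo, hi) for lo, hi in zip(low, high)]
              for _ in range(n_starts)]
    return min(starts, key=loss)


def metropolis(loss, theta0, step, n_samples, sigma2=1.0, seed=1):
    """Random-walk Metropolis targeting a density proportional to exp(-loss / (2 sigma2))."""
    rng = random.Random(seed)
    theta, current = list(theta0), loss(theta0)
    chain = []
    for _ in range(n_samples):
        proposal = [x + rng.gauss(0.0, step) for x in theta]
        candidate = loss(proposal)
        log_ratio = -(candidate - current) / (2.0 * sigma2)
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            theta, current = proposal, candidate
        chain.append(list(theta))
    return chain


start = best_of_random_starts(toy_loss, low=[-5.0, -5.0], high=[5.0, 5.0])
chain = metropolis(toy_loss, start, step=1.0, n_samples=20000)
kept = chain[2000:]  # discard the first samples as burn-in
posterior_mean = [sum(s[i] for s in kept) / len(kept) for i in range(2)]
```

With a flat prior, the retained samples concentrate around the minimizer of the loss, and their spread gives a rough measure of parameter uncertainty.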
3 Results
We first apply our method to predict the inflection time, at which the single-day number of confirmed cases is largest, during the outbreaks of COVID-19 in Chinese mainland and South Korea. Considering that the asymptomatic ratio of confirmed cases is very small in the surveillance period [1], we set χ = 0 when fitting the data reported by China CDC; for the other regions, we set χ = 1. Figures 2 and 3 plot the simulated populations of quarantined infectious and quarantined removed individuals using the parameters inferred from the surveillance data (see also Tab. 2). Here we set k_{max} = 1000.
With the estimated parameters and the simulated time course of the epidemic, one can readily obtain several key transmission characteristics of COVID-19, such as the inflection time, the mean latent period, the asymptomatic ratio p_{A}(t) and the effective reproduction number, defined as follows (see Appendix for the detailed derivation): (3.1)
In the surveillance period, the average R_{0} ranges from 1.74 to 3.28, and the mean latent period is relatively short in the Chinese mainland, Japan, and South Korea, while its average does not exceed ten days across the surveillance regions. Moreover, the observed values of the inflection point and of the number of existing confirmed cases at the inflection point fall within the confidence intervals. See Table 2 for the detailed estimation results using the surveillance data across different demographic regions. To explore the proportion of cases with asymptomatic infections in the surveillance period, we create box plots of the asymptomatic ratio p_{A}(t) at t = 7, see Figure 4. These plots show that the asymptomatic ratio of the United States exhibits a more pronounced increase than those of the other regions, and that the standard deviation of the predicted asymptomatic ratio of Japan is relatively high.
Table 2 The estimation of key transmission characteristics across different demographic regions.
Fig. 2 (a) Populations of the quarantined infectious and the quarantined removed individuals as functions of elapsed time since the COVID-19 outbreak in Chinese mainland. The circles represent the surveillance data by the China CDC from January 23 to June 16 with different colors indicating the reported data before and after February 12, and the colored area represents 95% CI of the model prediction using the parameters inferred by Algorithm 1. (b) Inferred removal rate of the quarantined symptomatic population.
Fig. 3 (a) Populations of the quarantined infectious and the quarantined removed individuals as functions of elapsed time since the COVID-19 outbreak in South Korea. The circles represent the surveillance data by Johns Hopkins University from February 16 to June 16 with different colors indicating the reported data before and after March 9, and the colored area represents 95% CI of the model prediction using the parameters inferred by Algorithm 1. (b) Inferred removal rate of the quarantined symptomatic population.
Fig. 4 Box plots of asymptomatic ratios p_{A}(t) at t = 7. 
4 Discussion
In this paper, we proposed a modified SEIR model with both quarantined and asymptomatic populations, as well as infection by exposed individuals, for fitting the observed transmission dynamics of COVID-19, thus enabling an accurate short-term prediction of the infectious disease and providing implications for outbreak control. Several important epidemiological parameters, represented by the effective reproduction numbers across different demographic regions, were also calculated and are close to the result obtained in [16]. A recent review [6] reported that the average estimate of R_{0} is around 3.28, which is consistent with our findings (see Tab. 2).
Our prediction of the asymptomatic ratio of COVID-19 also broadly agrees with estimates based on medical evidence [7]. From Table 3, the estimated transmission efficiency of asymptomatic cases is on average less than 50% of that of symptomatic individuals. By comparison, Li et al. [3] reported that the transmission rate per undocumented infection was 55% of the transmission rate per documented infection. Further, thanks to the high quarantine rate and detection rate (see Tab. 3), which benefit from efficient contact tracing, Chinese mainland and South Korea have already brought the epidemic under control. In contrast to these two regions, the detection rate of asymptomatic carriers in the United States was much lower, and its epidemic situation appears to have been more severe.
On the other hand, since the early recovery rate in these high-prevalence areas is generally low because of the long incubation period and the high incidence of COVID-19 in the aged population, it is imperative to continuously strengthen prevention and control measures and to improve medical standards. According to the current results, strengthening the prevention and control measures can slow the development of the epidemic.
To generalize this work, time-dependent parameters are needed in place of constant ones. In the present situation, possibly unsteady infection prevention and control measures may reduce the precision of our predictions. Besides, because the algorithm converges only locally, an appropriate initial estimation is needed to accelerate the fitting of the reported data. In future studies, it would be important to design a more effective algorithm for this kind of high-dimensional optimization problem.
Table 3 The average estimation of several parameters across different demographic regions.
Table 4 The reference value of the initial parameter estimation to the optimization problem.
Appendix A Derivation of equation (3.1)
In this appendix we provide a detailed derivation of the basic reproduction number using the next-generation matrix approach [11]. At the disease-free steady state S = N_{e}, we consider the following closed linearized infection subsystem of the model (2.1):
If we collect the infected compartments into a column vector, where the symbol ⊤ denotes transpose, the subsystem can be rewritten in vector form as follows:
where F and V represent the transmission and transition matrices of the linearized infection subsystem, respectively. More specifically,
Then from the next-generation matrix given by
we have the basic reproduction number as follows:
where ∥⋅∥ denotes the matrix spectral norm.
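The next-generation matrix recipe can be made concrete on a textbook SEIR model, which is used here as a hypothetical stand-in because it is much smaller than the infection subsystem of model (2.1). In the general formulation of [11], the linearized infection subsystem is written with a transmission matrix F and a transition matrix V, and the basic reproduction number is the spectral radius of FV^{-1}. The Python sketch below assumes infected compartments (E, I) with transmission rate beta, latency rate sigma and removal rate gamma; all names and values are illustrative assumptions.

```python
import math


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix via its characteristic polynomial."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc >= 0.0:
        root = math.sqrt(disc)
        return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))
    return math.sqrt(det)  # complex conjugate pair: |lambda|^2 equals det


# Linearization at the disease-free state of a textbook SEIR model:
#   E' = beta * I - sigma * E,   I' = sigma * E - gamma * I
beta, sigma, gamma = 0.6, 0.2, 0.25   # illustrative rates per day (assumptions)
F = [[0.0, beta], [0.0, 0.0]]          # new infections enter E through contact with I
V = [[sigma, 0.0], [-sigma, gamma]]    # transitions out of E and I
K = matmul_2x2(F, inverse_2x2(V))      # next-generation matrix F V^{-1}
r0 = spectral_radius_2x2(K)
```

For this toy model the result reduces to the familiar ratio beta / gamma, which provides a quick sanity check of the matrix construction.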
Appendix B Detailed description of Algorithm 1
Acknowledgements
This work was supported by the National Natural Science Foundation of China (11871343) and JSPS KAKENHI (15H05707). QG thanks D. Gao and X. Pan for their valuable advice. KA thanks H. Toshiyoshi for his valuable discussion.
References
[1] Epidemiology Working Group for NCIP Epidemic Response, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China. Chin. J. Epidemiol. 41 (2020) 145–151.
[2] H. Haario, M. Laine, A. Mira and E. Saksman, DRAM: efficient adaptive MCMC. Stat. Comput. 16 (2006) 339–354.
[3] R. Li, S. Pei, B. Chen, Y. Song, T. Zhang, W. Yang and J. Shaman, Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368 (2020) 489–493.
[4] Q. Lin, S. Zhao, D. Gao, Y. Lou, S. Yang, S.S. Musa et al., A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action. Int. J. Infect. Dis. 93 (2020) 211–216.
[5] M. Lipsitch, Transmission dynamics and control of severe acute respiratory syndrome. Science 300 (2003) 1966–1970.
[6] Y. Liu, A.A. Gayle, A. Wilder-Smith and J. Rocklöv, The reproductive number of COVID-19 is higher compared to SARS coronavirus. J. Travel Med. 27 (2020).
[7] K. Mizumoto, K. Kagaya, A. Zarebski and G. Chowell, Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance 25 (2020) 2000180.
[8] K. Prem, Y. Liu, T.W. Russell, A.J. Kucharski, R.M. Eggo, N. Davies et al., The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. Lancet Public Health 5 (2020) E261–E270.
[9] B. Tang, N.L. Bragazzi, Q. Li, S. Tang, Y. Xiao and J. Wu, An updated estimation of the risk of transmission of the novel coronavirus (2019-nCoV). Infect. Dis. Modell. 5 (2020) 248–255.
[10] B. Tang, X. Wang, Q. Li, N.L. Bragazzi, S. Tang, Y. Xiao and J. Wu, Estimation of the transmission risk of the 2019-nCoV and its implication for public health interventions. J. Clin. Med. 9 (2020) 462.
[11] P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math. Biosci. 180 (2002) 29–48.
[12] P.G. Walker, C. Whittaker, O. Watson, M. Baguelin, K.E.C. Ainslie, S. Bhatia et al., Report 12: The global impact of COVID-19 and strategies for mitigation and suppression. Technical report, Imperial College London (2020). https://www.imperial.ac.uk/media/imperialcollege/medicine/sph/ide/gidafellowships/ImperialCollegeCOVID19GlobalImpact26032020v2.pdf.
[13] K. Wan, J. Chen, C. Lu, L. Dong, Z. Wu and L. Zhang, When will the battle against novel coronavirus end in Wuhan: A SEIR modeling analysis. J. Glob. Health 10 (2020) 011002.
[14] Y. Wei, Z. Lu, Z. Du, Z. Zhang, Y. Zhao, S. Shen et al., Fitting and forecasting the trend of COVID-19 by SEIR(+CAQ) dynamic model. Chin. J. Epidemiol. 41 (2020) 470–475.
[15] J.T. Wu, K. Leung and G.M. Leung, Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet 395 (2020) 689–697.
[16] S. Zhao, Q. Lin, J. Ran, S.S. Musa, G. Yang, W. Wang et al., Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak. Int. J. Infect. Dis. 92 (2020) 214–217.