Math. Model. Nat. Phenom., Volume 15, 2020. Issue: Coronavirus: Scientific insights and societal aspects. Article number 74, 13 pages. https://doi.org/10.1051/mmnp/2020050. 10 December 2020.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## 1 Introduction

An outbreak of a novel coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), emerged in December 2019 and was declared a public health emergency of international concern by the World Health Organization (WHO) on January 30, 2020. To date, the COVID-19 pandemic has led to more than 16 million confirmed cases and 0.6 million deaths worldwide, which constitutes an unprecedented challenge to the global public health community. As epidemic prevention and control grow increasingly difficult in many countries, epidemiological modeling based on surveillance data provides useful mathematical tools for identifying and predicting disease outbreaks. Particularly important, among others, are the “Susceptible-Exposed-Infectious-Recovered” (SEIR) type models that account for the pre-symptomatic period of COVID-19. Wu et al. [15] employed the SEIR model to estimate the basic reproduction number of the outbreak in Wuhan, China. Prem et al. [8] considered a deterministic age-structured SEIR model with quarantine control included as a strategy for preventing contagion. Walker et al. [12] used an age-structured stochastic SEIR model to fit the key parameters of the observed transmission dynamics of COVID-19. Wan et al. [13] analyzed the spread dynamics and trend of COVID-19 in Wuhan using the data from the city’s closure to February 12, 2020. A series of papers [9, 10, 14] developed quarantined SEIR-type models by imposing restrictions on the mobility of patients, with model parameters fitted to the data reported by China CDC using the Markov Chain Monte Carlo (MCMC) method. Under the SEIR framework, Lin et al. [4] adopted a similar compartmental model to capture individual behavioural reaction and governmental actions against COVID-19. Li et al. [3] introduced a networked SEIR-type model with migration data within China, and employed Bayesian inference to infer critical epidemiological characteristics.

Recent months have witnessed the role and importance of mathematical modeling in the decision-making of governments in response to the current pandemic. However, detecting and forecasting the spread of infectious diseases at early stages remains a challenging task because of the lack of model selection criteria and the high computational cost of parameter tuning. Here, we aim to provide a modified SEIR model with asymptomatic and quarantined classes, adapted to the newly emerged COVID-19, and propose an MCMC-based method for inferring the key parameters and related transmission characteristics of the COVID-19 pandemic.

## 2 Methods

The data used to fit the parameters in this paper are the numbers of confirmed and removed cases in the Chinese mainland (23 January to 12 February 2020), France (7 March to 12 April 2020), Germany (7 March to 1 April 2020), Iran (7 March to 31 March 2020), Italy (7 March to 12 April 2020), Japan (20 March to 12 April 2020), South Korea (16 February to 9 March 2020), Spain (7 March to 12 April 2020), the United Kingdom (7 March to 12 April 2020), and the United States (7 March to 16 June 2020). Only the Chinese mainland and South Korea datasets are used in the main text to illustrate the results; the results for the other regions are available in Table 2.

We first illustrate the SEIR-type dynamic model under quarantined measures that is based on the previous works [5, 9, 10, 14], as shown in Figure 1. In particular, the infectious population is divided into four compartments: the unquarantined infectious individuals with symptoms (I), the undetected asymptomatic carriers (A), the quarantined infectious individuals with symptoms (Iq), and the detected asymptomatic carriers (Aq). Given the evidence showing that COVID-19 is contagious even during the incubation period of infection, we modify the previous SEIR models [5, 9, 10, 14] as follows: (2.1)

where the notation is summarized in Table 1.

Note that we exclude, without loss of generality, the quarantined susceptible individuals (Sq) from our SEIR model by assuming that they constitute a subpopulation isolated from the reaction mixture during the epidemic spread. Therefore we divide the total population into two parts: the quarantined susceptible population and an effective population. Here, we regard Ne as the effective population size, which can be obtained using the fraction p of the total population during the epidemic outbreak, Ne = pNtotal, where Ntotal is the total population size. In our numerical simulations, the effective population size Ne appears as a model parameter to be determined. By substituting S = Ne − E − Eq − I − Iq − A − Aq − R1 − R2, the model (2.1) can be written as (2.2)

Specifically, we adopted a two-phase removal rate for the quarantined symptomatic population as follows: (2.3)

where z1 and z2 are parameters controlling the low and high levels of the removal rate, and a and b are parameters controlling the position and steepness of the transition between the low and high rates. The intuition behind this choice is as follows. The cure rate is likely to be low at the early stage owing to the novelty of COVID-19, to improve gradually after a buffering period, and eventually to saturate as the pandemic unfolds. Our numerical results show that this heuristic removal rate significantly improves the reproduction of the observed surveillance data and enhances the prediction of COVID-19 outbreaks. We note that the prediction accuracy does not depend appreciably on the specific form of the removal rate, and other conceivable choices (e.g., a sigmoid function) generate similar results.
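As an illustration of such a two-level rate, the following minimal sketch implements the sigmoid alternative mentioned above, not equation (2.3) itself. It assumes that z1 is the low (early) level and z2 the high (late) level, that a is the midpoint of the transition in days, and that b is its steepness; the function name `removal_rate` is hypothetical.

```python
import math


def removal_rate(t, z1, z2, a, b):
    """Sigmoid (logistic) transition from the low rate z1 to the high rate z2.

    Assumed roles (not taken from equation (2.3)): a is the time at which the
    rate is halfway between z1 and z2, and b > 0 sets how sharp the switch is.
    The logistic function is written with tanh, which avoids the overflow that
    exp() can produce for large |b * (t - a)|.
    """
    logistic = 0.5 * (1.0 + math.tanh(0.5 * b * (t - a)))
    return z1 + (z2 - z1) * logistic


# Example: a rate that rises from 0.01/day to 0.10/day around day 20.
rates = [removal_rate(day, z1=0.01, z2=0.10, a=20.0, b=0.5) for day in range(41)]
```

With b > 0 and z2 > z1 the rate increases monotonically, stays close to z1 well before day a, and saturates at z2 well after it, which matches the qualitative behaviour described in the text.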

To infer the key transmission characteristics of COVID-19 including the effective reproduction number, the asymptomatic ratio, the mean latent period, and the inflection points, we fit the parameters of model (2.2) using the surveillance data collected from China CDC (http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm) and Johns Hopkins University (https://github.com/CSSEGISandData/COVID-19). Specifically, denote by Ĩq (t) and the number of confirmed and removed cases of COVID-19 on the day t in the surveillance period, and the least squares loss function for optimization of our SEIR model is then written as follows: (2.4)

where T is the surveillance period available for inference of our model parameters, and the hatted quantities such as Îq (t;Θ) and Âq (t;Θ) denote the corresponding numerical solution to equation (2.2) during the surveillance period, computed using the fourth-order Runge-Kutta method, with the epidemic parameters and initial conditions collectively denoted by Θ = [α, β, ε, η, γI, γA, …]. See Table 1 for a more detailed description. To solve the high-dimensional optimization problem effectively, we perform MCMC sampling [9, 10] of the parameter space using the MATLAB toolbox MCMCSTAT [2], as shown in Algorithm 1. In our optimization procedure, we randomly generate 20 sets of initial values in a reasonable range around the parameters, and the proposed algorithm selects the initial value that leads to the least sum of squared deviations from the surveillance data. We further examine numerically the initial-value sensitivity (IVS) of our model predictions, finding that a considerable fraction of the parameters share almost the same optimal initial values (which we call the “generic initial estimation”, see Table 4) across different demographic regions, possibly and partially because of the common epidemiological properties of the COVID-19 dynamics. This robust IVS of our model significantly alleviates the difficulty of the inference problem and renders the estimation of the (non-generic) parameters computationally tractable. It is noteworthy that our results, derived from local optima of a loss function subject to specific initial values of the (non-generic) parameters, provide only suboptimal, rather than optimal, predictions, but numerical experiments demonstrate the potential of our method for forecasting COVID-19 outbreaks in practice.
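The fitting steps described above (numerical solution by the fourth-order Runge-Kutta method, a least-squares loss, and selection of the best of 20 random initial values) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses a hypothetical three-compartment toy system (E, I, R) in place of model (2.2), hypothetical parameter names and sampling ranges, and it omits the MCMC sampling that the paper performs with MCMCSTAT.

```python
import random


def toy_rhs(y, theta, n_e):
    """Right-hand side of a toy SEIR system (an assumption, NOT model (2.2))."""
    e, i, r = y
    beta, sigma, gamma = theta           # hypothetical parameter names
    s = n_e - e - i - r                  # S eliminated via the conservation law
    return (beta * s * i / n_e - sigma * e, sigma * e - gamma * i, gamma * i)


def _shift(y, k, c):
    """Return y + c * k componentwise."""
    return tuple(a + c * b for a, b in zip(y, k))


def rk4_solve(theta, y0, n_e, days, steps_per_day=10):
    """Classical fourth-order Runge-Kutta; returns the state at each whole day."""
    h = 1.0 / steps_per_day
    y = tuple(y0)
    daily = [y]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1 = toy_rhs(y, theta, n_e)
            k2 = toy_rhs(_shift(y, k1, 0.5 * h), theta, n_e)
            k3 = toy_rhs(_shift(y, k2, 0.5 * h), theta, n_e)
            k4 = toy_rhs(_shift(y, k3, h), theta, n_e)
            y = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * u + v)
                      for a, p, q, u, v in zip(y, k1, k2, k3, k4))
        daily.append(y)
    return daily


def sse_loss(theta, data_i, data_r, y0, n_e):
    """Sum of squared deviations between simulated and reported daily counts."""
    traj = rk4_solve(theta, y0, n_e, days=len(data_i) - 1)
    return sum((y[1] - di) ** 2 + (y[2] - dr) ** 2
               for y, di, dr in zip(traj, data_i, data_r))


def best_start(data_i, data_r, y0, n_e, n_starts=20, seed=0):
    """Draw n_starts random initial guesses and keep the one with the least loss."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(0.1, 1.0), rng.uniform(0.1, 0.5), rng.uniform(0.05, 0.3))
                  for _ in range(n_starts)]  # sampling ranges are illustrative only
    theta = min(candidates, key=lambda th: sse_loss(th, data_i, data_r, y0, n_e))
    return theta, sse_loss(theta, data_i, data_r, y0, n_e)
```

In a full pipeline the selected initial value would seed the sampler rather than serve as the final estimate; here it only illustrates how the 20 candidates are ranked by their sum of squared deviations.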

Fig. 1 SEIR model with quarantined measures.
Table 1

Full list of notations used in the SEIR model.

## 3 Results

We first apply our method to predict the inflection time with the largest single-day number of confirmed cases during the outbreaks of COVID-19 in the Chinese mainland and South Korea, respectively. Considering that the asymptomatic ratio of confirmed cases is very small in the surveillance period [1], we set χ = 0 when fitting the data reported by China CDC. In the other regions, we set χ = 1. Figures 2 and 3 plot the simulated populations of quarantined infectious and quarantined removed individuals using the parameters inferred from the surveillance data (see also Tab. 2). Here we set kmax = 1000.

With the estimated parameters and simulated time course of epidemics, one can readily obtain several key transmission characteristics of COVID-19, such as the inflection time, the mean latent period, the asymptomatic ratio, and the effective reproduction number, defined as follows (see the Appendix for a detailed derivation): (3.1)

In the surveillance period, the average effective reproduction number ranges from 1.74 to 3.28, and the mean latent period is relatively short in the Chinese mainland, Japan, and South Korea, while its average does not exceed ten days in any of the surveillance regions. Moreover, the real values of the inflection point and of the number of existing confirmed cases at the inflection point fall within the confidence intervals. See Table 2 for the detailed estimation results using the surveillance data across different demographic regions. To explore the proportion of cases with asymptomatic infections in the surveillance period, we create box plots of the asymptomatic ratio pA (t) at t = 7, see Figure 4. These plots show that the asymptomatic ratio of the United States is noticeably higher than those of the other regions, and that the standard deviation of the predicted asymptomatic ratio for Japan is relatively high.
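To make the notion of an inflection point concrete, the sketch below reads it off a simulated curve. It assumes that the inflection time is the day on which the number of existing confirmed cases peaks, which is one reading of the description above and may differ from the definition in equation (3.1); the function name and the sample series are hypothetical.

```python
def inflection_point(series):
    """Return (day, value) where the daily series attains its maximum.

    Assumption: the series holds the simulated number of existing confirmed
    cases per day, indexed from day 0. Ties resolve to the earliest day.
    """
    if not series:
        raise ValueError("series must be non-empty")
    day = max(range(len(series)), key=series.__getitem__)
    return day, series[day]


# Hypothetical simulated counts, for illustration only.
simulated_cases = [120, 450, 980, 1530, 1710, 1640, 1390, 1020]
peak_day, peak_cases = inflection_point(simulated_cases)
```

Applying the same function to each sampled parameter set would give a distribution of peak days and peak sizes, from which intervals like those reported in Table 2 could in principle be formed.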

Table 2

The estimation of key transmission characteristics across different demographic regions.

Fig. 2 (a) Populations of the quarantined infectious and the quarantined removed individuals as functions of elapsed time since the COVID-19 outbreak in the Chinese mainland. The circles represent the surveillance data by the China CDC from January 23 to June 16 with different colors indicating the reported data before and after February 12, and the colored area represents the 95% CI of the model prediction using the parameters inferred by Algorithm 1. (b) Inferred removal rate of the quarantined symptomatic population.
Fig. 3 (a) Populations of the quarantined infectious and the quarantined removed individuals as functions of elapsed time since the COVID-19 outbreak in South Korea. The circles represent the surveillance data by Johns Hopkins University from February 16 to June 16 with different colors indicating the reported data before and after March 9, and the colored area represents the 95% CI of the model prediction using the parameters inferred by Algorithm 1. (b) Inferred removal rate of the quarantined symptomatic population.
Fig. 4 Box plots of asymptomatic ratios pA(t) at t = 7.

## 4 Discussion

In this paper, we proposed a modified SEIR model with both quarantined and asymptomatic populations, as well as infection by exposed individuals, for fitting the observed transmission dynamics of COVID-19, thus enabling an accurate short-term prediction of the infectious disease and providing implications for outbreak control. Several important epidemiological parameters, represented by the effective reproduction numbers across different demographic regions, were also calculated and are close to the result obtained in [16]. A recent review [6] reported that the average estimate of the reproduction number is around 3.28, which is consistent with our findings (see Tab. 2).

Our prediction of the asymptomatic ratio of COVID-19 also broadly agrees with estimates based on medical evidence [7]. From Table 3, the transmission efficiency of asymptomatic cases is on average less than 50% of that of symptomatic individuals. By comparison, [3] reported that the transmission rate of each undocumented infectious individual was 55% of that of a documented infectious case. Further, thanks to the high quarantine and detection rates (see Tab. 3), which benefit from efficient contact tracing, the Chinese mainland and South Korea have already brought the epidemic under control. In contrast to these two regions, the detection rate of asymptomatic carriers in the United States was much lower, and its epidemic situation seems to have been more severe.

On the other hand, because the early recovery rate in these high-prevalence areas is generally low, owing to the long incubation period and the high incidence of COVID-19 in the elderly population, it is imperative to continuously strengthen prevention and control measures and to improve medical standards. According to our current results, strengthening these measures can slow the development of the epidemic.

To generalize this work, time-dependent parameters are needed in place of constant ones. In the present situation, unsteady infection prevention and control measures may reduce the precision of our predictions. Besides, because the algorithm converges only locally, an appropriate initial estimate must be found to accelerate the fitting of the reported data. In future studies, it would be important to design a more effective algorithm for this kind of high-dimensional optimization problem.

Table 3

The average estimation of several parameters across different demographic regions.

Table 4

The reference value of the initial parameter estimation to the optimization problem.

## Appendix A Derivation of equation (3.1)

In this appendix we provide a detailed derivation of the basic reproduction number using the next-generation matrix approach [11]. At the disease-free steady state S = Ne, we consider the following closed linearized infection subsystem of the model (2.1):

If we set, where the symbol ⊤ denotes transpose, the subsystem can be rewritten in a vector form as follows:

where F and V represent the transmission and transition matrices of the linearized infection subsystem, respectively. More specifically,

and

Then from the next-generation matrix given by

we have the basic reproduction number as follows:

where ∥⋅∥ denotes the matrix spectral norm.
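As a worked illustration of the next-generation matrix approach, the sketch below computes a reproduction number for a hypothetical two-compartment infection subsystem (exposed E and infectious I), not for the full model (2.1). It follows the convention of [11], in which the linearized subsystem reads dx/dt = (F − V)x and the reproduction number is the spectral radius of FV⁻¹. The parameter names `beta_e`, `beta_i`, `sigma`, and `gamma` are assumptions introduced for this example.

```python
import cmath


def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus, from the characteristic polynomial."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))


def toy_reproduction_number(beta_e, beta_i, sigma, gamma):
    """R0 of a toy (E, I) subsystem in which both E and I transmit.

    F holds the new-infection terms (all new infections enter E);
    V holds the transitions: E leaves at rate sigma into I, I leaves at rate gamma.
    """
    f = [[beta_e, beta_i], [0.0, 0.0]]
    v = [[sigma, 0.0], [-sigma, gamma]]
    return spectral_radius_2x2(matmul_2x2(f, inverse_2x2(v)))


r0 = toy_reproduction_number(beta_e=0.1, beta_i=0.5, sigma=0.2, gamma=0.25)
```

For this toy system the result can be checked by hand: FV⁻¹ has the single nonzero eigenvalue beta_e/sigma + beta_i/gamma, that is, transmission during the exposed stage plus transmission during the infectious stage, each weighted by the mean time spent in that stage.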

## Acknowledgements

This work was supported by the National Natural Science Foundation of China (11871343) and JSPS KAKENHI (15H05707). QG thanks D. Gao and X. Pan for their valuable advice. KA thanks H. Toshiyoshi for his valuable discussion.

## References

1. Epidemiology Working Group for NCIP Epidemic Response, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China. Chin. J. Epidemiol. 41 (2020) 145–151.
2. H. Haario, M. Laine, A. Mira and E. Saksman, DRAM: efficient adaptive MCMC. Stat. Comput. 16 (2006) 339–354.
3. R. Li, S. Pei, B. Chen, Y. Song, T. Zhang, W. Yang and J. Shaman, Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 368 (2020) 489–493.
4. Q. Lin, S. Zhao, D. Gao, Y. Lou, S. Yang, S.S. Musa et al., A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action. Int. J. Infect. Dis. 93 (2020) 211–216.
5. M. Lipsitch, Transmission dynamics and control of severe acute respiratory syndrome. Science 300 (2003) 1966–1970.
6. Y. Liu, A.A. Gayle, A. Wilder-Smith and J. Rocklöv, The reproductive number of COVID-19 is higher compared to SARS coronavirus. J. Travel Med. 27 (2020).
7. K. Mizumoto, K. Kagaya, A. Zarebski and G. Chowell, Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Eurosurveillance 25 (2020) 2000180.
8. K. Prem, Y. Liu, T.W. Russell, A.J. Kucharski, R.M. Eggo, N. Davies et al., The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. Lancet Public Health 5 (2020) E261–E270.
9. B. Tang, N.L. Bragazzi, Q. Li, S. Tang, Y. Xiao and J. Wu, An updated estimation of the risk of transmission of the novel coronavirus (2019-nCov). Infect. Dis. Modell. 5 (2020) 248–255.
10. B. Tang, X. Wang, Q. Li, N.L. Bragazzi, S. Tang, Y. Xiao and J. Wu, Estimation of the transmission risk of the 2019-nCoV and its implication for public health interventions. J. Clin. Med. 9 (2020) 462.
11. P. van den Driessche and J. Watmough, Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math. Biosci. 180 (2002) 29–48.
12. P.G. Walker, C. Whittaker, O. Watson, M. Baguelin, K.E.C. Ainslie, S. Bhatia et al., Report 12: The global impact of COVID-19 and strategies for mitigation and suppression. Technical report, Imperial College London (2020). https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gidafellowships/Imperial-College-COVID19-Global-Impact-26-03-2020v2.pdf.
13. K. Wan, J. Chen, C. Lu, L. Dong, Z. Wu and L. Zhang, When will the battle against novel coronavirus end in Wuhan: A SEIR modeling analysis. J. Glob. Health 10 (2020) 011002.
14. Y. Wei, Z. Lu, Z. Du, Z. Zhang, Y. Zhao, S. Shen et al., Fitting and forecasting the trend of COVID-19 by seir (+ caq) dynamic model. Chin. J. Epidemiol. 41 (2020) 470–475.
15. J.T. Wu, K. Leung and G.M. Leung, Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study. Lancet 395 (2020) 689–697.
16. S. Zhao, Q. Lin, J. Ran, S.S. Musa, G. Yang, W. Wang et al., Preliminary estimation of the basic reproduction number of novel coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven analysis in the early phase of the outbreak. Int. J. Infect. Dis. 92 (2020) 214–217.
