In this section, we briefly review the SIR epidemic model on contact networks (Youssef and Scoglio 2011; Prasse and Van Mieghem 2020b) and the SIR-based prediction of COVID-19 infections, caused by the SARS-CoV-2 virus (Prasse et al. 2020). We then incorporate the time-varying protocols introduced by the government to slow down the spread of the virus.
We consider a network with N nodes, where each node i corresponds to the set of individuals living in the same place, such as a city or a region. At any discrete time \(k=1, 2, \ldots \), an individual is in exactly one of the \(c=3\) compartments: Susceptible (S), Infectious (I), or Recovered (R). The SIR model assumes that infectious individuals become recovered and can no longer infect others, because of hospitalization, death, or quarantine measures. The viral state of node i at time k is denoted by the \(3 \times 1\) vector \(v_i[k] =(S_i[k],I_i[k],R_i[k])^T\), where \(S_i[k],~I_i[k],~R_i[k]\) are the fractions of susceptible, infectious, and recovered individuals, respectively, which satisfy the conservation law \(S_i[k]+I_i[k]+R_i[k]=1\). The discrete-time SIR model (Youssef and Scoglio 2011; Prasse and Van Mieghem 2020b) defines the evolution of the viral state \(v_i[k]\) of each node i as:
$$\begin{aligned} I_i[k+1]= & {} (1 - \delta _i)I_i[k] + (1- I_i[k] - R_i[k] )\sum \limits _{j=1}^N \beta _{ij} I_j[k] \end{aligned}$$
(1)
$$\begin{aligned} R_i[k+1]= & {} R_i[k] + \delta _i I_i[k] \end{aligned}$$
(2)
where \(\beta _{ij}\) denotes the infection probability when individuals move from place (also called region) j to place i. The factor \(1- I_i[k] - R_i[k]\) in (1) is the susceptible fraction \(S_i[k]\), which follows from the conservation law. The self-infection probability \(\beta _{ii}\ne 0\), because individuals within the same place interact. The \(N \times N\) infection probability matrix B, with entries \(\beta _{ij}\), specifies the probability of contact transmission between each pair of regions. The curing probability \(\delta _i\) quantifies the capability of individuals in place i to recover from the virus. We assume that both \(\beta _{ij}\) and \(\delta _i\) in the SIR model (1), (2) are constant over time.
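The update rules (1) and (2) can be iterated directly. The following is a minimal sketch in plain Python, not the authors' implementation; the function names `sir_step` and `simulate`, the two-region matrix `B`, and the values of `delta` are made up purely for illustration.

```python
def sir_step(I, R, B, delta):
    """Apply one step of the discrete-time SIR model, Eqs. (1) and (2).

    I, R  : lists with the infectious / recovered fraction of each region at time k
    B     : N x N list of lists, B[i][j] = beta_ij (infection probability j -> i)
    delta : list with the curing probability delta_i of each region
    """
    n = len(I)
    I_next, R_next = [], []
    for i in range(n):
        # Infection pressure on region i from every region j (including i itself).
        pressure = sum(B[i][j] * I[j] for j in range(n))
        # Susceptible fraction obtained from the conservation law S + I + R = 1.
        S_i = 1.0 - I[i] - R[i]
        I_next.append((1.0 - delta[i]) * I[i] + S_i * pressure)  # Eq. (1)
        R_next.append(R[i] + delta[i] * I[i])                    # Eq. (2)
    return I_next, R_next


def simulate(I0, R0, B, delta, steps):
    """Iterate sir_step and return the full trajectories of I and R."""
    I_traj, R_traj = [list(I0)], [list(R0)]
    for _ in range(steps):
        I_new, R_new = sir_step(I_traj[-1], R_traj[-1], B, delta)
        I_traj.append(I_new)
        R_traj.append(R_new)
    return I_traj, R_traj


# Hypothetical parameters for N = 2 regions (illustrative values only).
B = [[0.30, 0.05],
     [0.02, 0.25]]
delta = [0.10, 0.12]
I_traj, R_traj = simulate([0.01, 0.0], [0.0, 0.0], B, delta, steps=50)
```

Because \(\beta _{ij}\) and \(\delta _i\) are held constant, the same `B` and `delta` are reused at every step; a time-varying variant would pass a different matrix per step.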
Prasse et al. (2020) proposed the Network Inference-based Prediction Algorithm (NIPA), which estimates the spreading parameters \(\delta _i\) and \(\beta _{ij}\) for each region i from the time series \(v_i[1], v_i[2], \ldots , v_i[n]\). Substituting these estimates into (1) and (2) and iterating forward predicts the evolution of the virus at future times \(k>n\).
The SIR model has three compartments. In principle, with c compartments, \(c-1\) independent measurements are required, since the remaining compartment follows from the conservation law. The input to NIPA is only one compartment, the infectious compartment I, which is fewer than the \(c-1=2\) compartments necessary to reconstruct the network with the SIR model. NIPA therefore generates the observations of the R compartment itself, by iterating over different candidate values of the curing probability \(\delta _i\) and assuming the initial condition \(R(0)=0\). Once the curing probability \(\delta _i\) has been estimated in the training phase, the recovered compartment R follows from Eq. (2).
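The reconstruction of R from I for a single candidate curing probability can be sketched as follows. This is an illustrative plain-Python fragment under the assumptions stated in the text (zero initial recovered fraction, recursion of Eq. (2)); the helper name `recovered_from_infectious` and the sample series `I_obs` are hypothetical.

```python
def recovered_from_infectious(I_series, delta, R_init=0.0):
    """Build the recovered fractions of one region from its infectious fractions.

    Uses the recursion of Eq. (2), R[k+1] = R[k] + delta * I[k], starting from
    the assumed initial condition R_init = 0. The result has the same length as
    I_series, so the last infectious value is not consumed.
    """
    R_series = [R_init]
    for I_k in I_series[:-1]:
        R_series.append(R_series[-1] + delta * I_k)
    return R_series


# Made-up infectious fractions of one region over four days.
I_obs = [0.01, 0.02, 0.04, 0.05]
R_est = recovered_from_infectious(I_obs, delta=0.1)
```

Repeating this call for every candidate value of \(\delta _i\) yields one candidate R series per value, from which the training phase selects the best-fitting one.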
To obtain the curing probability \(\delta _i\), 50 equidistant candidate values between \(\delta _{min}\) and \(\delta _{max}\) are considered, and the value giving the best fit of model (1) is used to estimate the matrix B with the least absolute shrinkage and selection operator (LASSO). For a general class of dynamics on networks (including the SIR model), completely different network topologies can result in the same dynamics. Hence, the network cannot be deduced accurately from observations, regardless of the reconstruction method: when two very different networks match the observations perfectly, there is no reason to infer one network rather than the other. Thus, although NIPA accurately predicts the dynamics, the estimated network B can be very different from the true network (Prasse and Van Mieghem 2020c).
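For a fixed candidate \(\delta _i\), Eq. (1) can be rearranged so that the unknown \(\beta _{ij}\) enter linearly, which is what makes a LASSO regression applicable. The following is only a sketch of one possible formulation: the grid with both endpoints included, the index m, the regularisation weight \(\rho _i\), and the non-negativity constraint are assumptions made here for illustration, and the exact objective used by NIPA is given in Prasse et al. (2020). The candidate grid may be written as

$$\begin{aligned} \delta _i^{(m)} = \delta _{min} + \frac{m-1}{49}\left( \delta _{max} - \delta _{min}\right) , \quad m=1, \ldots , 50, \end{aligned}$$

and moving the curing term of Eq. (1) to the left-hand side gives, for each \(k=1, \ldots , n-1\),

$$\begin{aligned} I_i[k+1] - (1 - \delta _i)I_i[k] = \sum \limits _{j=1}^N \beta _{ij} \left( 1- I_i[k] - R_i[k]\right) I_j[k], \end{aligned}$$

where the left-hand side and the products \((1- I_i[k] - R_i[k])I_j[k]\) are known once \(R_i[k]\) has been generated from Eq. (2). A LASSO estimate of row i of B then solves

$$\begin{aligned} \min _{\beta _{i1}, \ldots , \beta _{iN} \ge 0} \; \sum \limits _{k=1}^{n-1} \left( I_i[k+1] - (1 - \delta _i)I_i[k] - \left( 1- I_i[k] - R_i[k]\right) \sum \limits _{j=1}^N \beta _{ij} I_j[k] \right) ^2 + \rho _i \sum \limits _{j=1}^N \beta _{ij}. \end{aligned}$$

The penalty term pushes many \(\beta _{ij}\) to zero, which yields a sparse estimate of B but, as noted above, not necessarily the true network.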
Let n be the number of days on which the infection has been observed. To evaluate the prediction accuracy, the last \(n_{neglect}\) days are withheld from the time series \(v_i[1], v_i[2], \ldots , v_i[n]\). The model is then trained on the days \(v_i[1], v_i[2], \ldots , v_i[n- n_{neglect}]\). Thereafter, the omitted \(n_{neglect}\) days (\(k=n- n_{neglect}+1, \ldots , n\)) are predicted and compared with the observations. It is also possible to predict \(n_{predict}\) days (\(k=n+1, \ldots , n+n_{predict}\)) beyond the n available observations; in that case, however, the accuracy of the prediction cannot be evaluated, because no observations are available for comparison.
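The hold-out procedure amounts to splitting each region's time series at day \(n- n_{neglect}\). A minimal sketch in plain Python is given below; the helper names `split_for_evaluation` and `mean_absolute_error` are hypothetical, and the mean absolute error is an assumed error measure chosen for illustration, since the text does not name one.

```python
def split_for_evaluation(series, n_neglect):
    """Split an observed series v[1..n] into a training and a held-out part.

    The training part is v[1..n - n_neglect]; the held-out part contains the
    last n_neglect observations, which are predicted and then compared.
    """
    if not 0 < n_neglect < len(series):
        raise ValueError("n_neglect must satisfy 0 < n_neglect < n")
    cut = len(series) - n_neglect
    return series[:cut], series[cut:]


def mean_absolute_error(predicted, observed):
    """Assumed error measure: average absolute deviation over the held-out days."""
    if len(predicted) != len(observed):
        raise ValueError("series must have equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Made-up infectious fractions for n = 10 days of one region.
I_obs = [0.010, 0.012, 0.015, 0.019, 0.024, 0.030, 0.037, 0.045, 0.054, 0.064]
train, held_out = split_for_evaluation(I_obs, n_neglect=3)

# A placeholder "prediction" that repeats the last training value, used only
# to show how the held-out days would be scored.
naive_prediction = [train[-1]] * len(held_out)
error = mean_absolute_error(naive_prediction, held_out)
```

For a forecast beyond day n, the whole series would be used for training and no such error could be computed.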
Prasse et al. (2020) showed that the approach accurately predicts the cumulative infections for \(n_{neglect} \le 5\). However, as the number of neglected days increases, the prediction accuracy of NIPA decreases. NIPA assumes constant values for \(\beta _{ij}\), which does not reflect the reality of the COVID-19 pandemic, because the containment measures imposed by governments reduce \(\beta _{ij}\) and thus the spread of the infection. Hence, time-varying infection probabilities \(\beta _{ij}[k]\) should be considered in the epidemic model.