Towards Inferring Network Properties from Epidemic Data
Epidemic propagation on networks represents an important departure from traditional mass-action models. However, the high dimensionality of exact models poses a challenge to both mathematical analysis and parameter inference. Mean-field models, such as the pairwise model (PWM), make this high dimensionality tractable. While such models have been used extensively for model analysis, there is limited work in the context of statistical inference. In this paper, we explore the extent to which the PWM with susceptible-infected-recovered (SIR) epidemic dynamics can be used to infer disease- and network-related parameters. Data from an epidemic can be loosely categorised as being population level, e.g., daily new cases, or individual level, e.g., recovery times. To understand if and how network inference is influenced by the type of data, we employed the widely used maximum likelihood estimation (MLE) approach for population-level data and dynamical survival analysis (DSA) for individual-level data. For scenarios in which there is no model mismatch, such as when data are generated via simulations, both methods perform well despite strong dependence between parameters. In contrast, for real-world data, such as foot-and-mouth disease, H1N1 and COVID-19, the DSA method appears fairly robust to potential model mismatch and produces parameter estimates that are epidemiologically plausible, whereas our results with the MLE method revealed several issues pertaining to parameter unidentifiability and a lack of robustness when key quantities, such as population size and/or the proportion of under-reporting, are not known exactly. Taken together, however, our findings suggest that network-based mean-field models can be used to formulate approximate likelihoods which, coupled with an efficient inference scheme, make it possible to learn not only about the parameters of the disease dynamics but also about those of the underlying network.
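To make the pairwise SIR model concrete, the following is a minimal sketch under stated assumptions, not the formulation used in the paper. It assumes a regular network in which every node has degree n, and the standard triple closure [ABC] ≈ ((n-1)/n)[AB][BC]/[B]. The names `tau` (per-link infection rate), `gamma` (recovery rate), `n`, and both function names are illustrative choices. A fixed-step Runge-Kutta integrator is used only to keep the example dependency-free.

```python
def pairwise_sir_rhs(state, tau, gamma, n):
    """Right-hand side of a closed pairwise SIR model (assumed regular network)."""
    S, I, R, SI, SS = state
    kappa = (n - 1.0) / n  # closure factor for a degree-n regular network (assumption)
    if S > 0.0:
        SSI = kappa * SS * SI / S  # closed triple [SSI]
        ISI = kappa * SI * SI / S  # closed triple [ISI]
    else:
        SSI = ISI = 0.0
    dS = -tau * SI
    dI = tau * SI - gamma * I
    dR = gamma * I
    dSI = -(tau + gamma) * SI + tau * (SSI - ISI)
    dSS = -2.0 * tau * SSI
    return (dS, dI, dR, dSI, dSS)


def simulate_pairwise_sir(N, I0, tau, gamma, n, t_end, dt=0.01):
    """Integrate the model with classical RK4; returns a list of (t, state)."""
    S0 = N - I0
    # Initial pair counts assume infected nodes are placed uniformly at random.
    state = (S0, I0, 0.0, n * S0 * I0 / N, n * S0 * S0 / N)

    def f(y):
        return pairwise_sir_rhs(y, tau, gamma, n)

    trajectory = [(0.0, state)]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        k1 = f(state)
        k2 = f(tuple(y + 0.5 * dt * d for y, d in zip(state, k1)))
        k3 = f(tuple(y + 0.5 * dt * d for y, d in zip(state, k2)))
        k4 = f(tuple(y + dt * d for y, d in zip(state, k3)))
        state = tuple(
            y + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for y, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
        trajectory.append((k * dt, state))
    return trajectory


if __name__ == "__main__":
    # Illustrative parameter values only; they are not estimates from any data set.
    traj = simulate_pairwise_sir(N=1000.0, I0=1.0, tau=0.5, gamma=1.0, n=6, t_end=20.0)
    t, (S, I, R, SI, SS) = traj[-1]
    print(f"t={t:.1f}  S={S:.1f}  I={I:.1f}  R={R:.1f}")
```

In this sketch the incidence term `tau * SI` is the model quantity one would naturally compare with population-level data such as daily new cases, while `n` is the network-related parameter. How the paper actually builds its likelihoods from the PWM is not specified in the abstract, so that link is left as an assumption here.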