Mitigating bias in estimating epidemic severity due to heterogeneity of epidemic onset and data aggregation

Ann Epidemiol. 2022 Jan:65:1-14. doi: 10.1016/j.annepidem.2021.07.008. Epub 2021 Aug 19.

Abstract

Outbreaks of infectious diseases, such as influenza, are a major societal burden. Mitigation policies during an outbreak or pandemic are guided by the analysis of data from ongoing or preceding epidemics. The reproduction number, R0, defined as the expected number of secondary infections arising from a single infected individual in a population of susceptibles, is critical to epidemiology. For typical compartmental models such as the Susceptible-Infected-Recovered (SIR) model, R0 represents the severity of an epidemic. It is an estimate of the early-stage growth rate of an epidemic and is an important threshold parameter used to gain insight into the spread or decay of an outbreak. Models typically treat incidence counts as indicators of cases within a single large population; however, epidemic data are the result of a hierarchical aggregation, in which incidence counts from spatially separated monitoring sites (or sub-regions) are pooled and used to infer R0. Is this aggregation approach valid when the epidemic has different dynamics across the regions monitored? We characterize the bias in estimating R0 from a merged data set when the epidemics of the sub-regions used in the merger exhibit delays in onset. We propose a method to mitigate this bias and study its efficacy on synthetic data as well as real-world influenza and COVID-19 data.

Keywords: Aggregation bias; COVID-19; Delays in epidemics; Epidemiology; Influenza; Reproduction number.
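
The aggregation question raised in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's method or data: it assumes a deterministic SIR model integrated with forward Euler, two sub-regions with identical parameters, and the standard SIR early-growth relation R0 ≈ 1 + r/gamma as the estimator. All names and values (sir_incidence, r0_from_growth, beta, gamma, delay, window) are hypothetical choices made for illustration.

```python
import numpy as np

def sir_incidence(beta, gamma, n_pop, i0, n_days, dt=0.1):
    """Daily new infections from a deterministic SIR model (forward Euler)."""
    steps_per_day = int(round(1 / dt))
    s, i = float(n_pop - i0), float(i0)
    daily = np.zeros(n_days)
    for day in range(n_days):
        for _ in range(steps_per_day):
            new_inf = beta * s * i / n_pop * dt
            s -= new_inf
            i += new_inf - gamma * i * dt
            daily[day] += new_inf
    return daily

def r0_from_growth(incidence, gamma, window):
    """Fit log-incidence linearly over the first `window` days.

    Uses the SIR early-phase relation R0 ~ 1 + r / gamma, where r is
    the fitted exponential growth rate (an assumed, simple estimator).
    """
    t = np.arange(window)
    r = np.polyfit(t, np.log(incidence[:window]), 1)[0]
    return 1 + r / gamma

# Illustrative parameters (assumptions): true R0 = beta / gamma = 2.
beta, gamma = 0.5, 0.25
n_pop, i0, n_days = 1_000_000, 10, 60
delay, window = 5, 10  # onset delay of region B and fitting window, in days

region_a = sir_incidence(beta, gamma, n_pop, i0, n_days)
# Region B has the same dynamics, but its epidemic starts `delay` days later.
region_b = np.concatenate([np.zeros(delay), region_a[:-delay]])
merged = region_a + region_b  # pooled incidence, as in aggregated reporting

r0_single = r0_from_growth(region_a, gamma, window)
r0_merged = r0_from_growth(merged, gamma, window)
print(f"single region: {r0_single:.3f}, merged: {r0_merged:.3f}")
```

In this toy setup both sub-regions share the same R0, yet the estimate from the merged series differs from the single-region estimate, because the late-starting region adds a step to the log-incidence curve inside the fitting window. The size and direction of the discrepancy depend on the window, the delay, and the estimator, so this should be read only as a demonstration that onset delays can distort a pooled estimate.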

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Basic Reproduction Number
  • COVID-19*
  • Data Aggregation
  • Disease Outbreaks
  • Epidemics*
  • Epidemiological Models
  • Humans
  • Pandemics
  • SARS-CoV-2