Better data for decision-making through Bayesian imputation of suppressed provisional COVID-19 death counts

PLoS One. 2023 Aug 3;18(8):e0288961. doi: 10.1371/journal.pone.0288961. eCollection 2023.

Abstract

Purpose: To facilitate use of timely, granular, and publicly available data on COVID-19 mortality, we provide a method for imputing suppressed COVID-19 death counts in the National Center for Health Statistics' 2020 provisional mortality data by quarter, county, and age.

Methods: We used a Bayesian approach to impute suppressed COVID-19 death counts by quarter, county, and age in provisional data for 3,138 US counties. Our model accounts for multilevel data structures; numerous zero death counts among persons aged <50 years, in rural counties, and in the early quarters of 2020; highly right-skewed distributions; and different levels of data granularity (county, state or locality, and national levels). To impute the suppressed death counts, we compared three models with different prior assumptions about suppressed COVID-19 deaths: noninformative priors (M1), the same weakly informative priors for all age groups (M2), and weakly informative priors that differ by age (M3). Once the imputed counts were available, we assessed the three prior assumptions at the national, state/locality, and county levels. Finally, we compared US counties by two types of COVID-19 death rates, crude (CDR) and age-standardized (ASDR), which can be estimated only by imputing the suppressed death counts.
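The idea of imputing a suppressed cell can be illustrated with a minimal sketch. It is not the authors' multilevel model: it assumes that a suppressed cell hides a count between 1 and 9 (the usual NCHS suppression rule, which the abstract does not state) and it replaces the posterior distribution of the death rate with a single fixed, hypothetical rate. The function names and the value 2.5 are illustrative only.

```python
import math
import random

# Assumption: a suppressed cell holds a count from 1 to 9. The abstract does
# not give the bounds; they are taken from the usual NCHS suppression rule.
LOW, HIGH = 1, 9


def truncated_poisson_pmf(rate, low=LOW, high=HIGH):
    """Return Poisson(rate) probabilities renormalized over low..high."""
    weights = [
        math.exp(-rate) * rate**k / math.factorial(k)
        for k in range(low, high + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def impute_suppressed(rate, n_draws, rng):
    """Draw n_draws plausible counts for one suppressed cell."""
    pmf = truncated_poisson_pmf(rate)
    return rng.choices(range(LOW, HIGH + 1), weights=pmf, k=n_draws)


# Hypothetical expected count for one county-quarter-age cell. In a full
# Bayesian model this would vary across posterior samples, not stay fixed.
rng = random.Random(0)
draws = impute_suppressed(rate=2.5, n_draws=1000, rng=rng)
mean_draw = sum(draws) / len(draws)
```

Every draw respects the suppression bounds, so summing imputed cells can never fall below the number of suppressed cells or exceed nine times that number. A prior that differs by age, as in M3, would correspond here to a different `rate` per age group.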

Results: Without imputation, the total COVID-19 death count obtained from the raw data underestimated the reported national COVID-19 deaths by 18.60%. Using imputed data, we overestimated the national COVID-19 deaths by 3.57% (95% CI: 3.37%-3.80%) in model M1, 2.23% (95% CI: 2.04%-2.43%) in model M2, and 2.96% (95% CI: 2.76%-3.16%) in model M3 compared with the national report. The 20 counties most affected by COVID-19 mortality differed depending on whether counties were ranked by CDR or by ASDR.
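Why a ranking by CDR can differ from a ranking by ASDR is easiest to see with a small worked example. The sketch below uses direct standardization with invented numbers: the two counties, the three age groups, and the standard-population weights are all hypothetical and are not taken from the study.

```python
def crude_rate(deaths, population, per=100_000):
    """Crude death rate: total deaths over total population."""
    return sum(deaths) / sum(population) * per


def age_standardized_rate(deaths, population, std_weights, per=100_000):
    """Directly standardized rate: age-specific rates weighted by a
    standard population whose weights sum to 1."""
    if abs(sum(std_weights) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    return per * sum(
        w * d / p for d, p, w in zip(deaths, population, std_weights)
    )


# Hypothetical standard weights for ages <50, 50-74, and 75+.
STD_WEIGHTS = [0.65, 0.28, 0.07]

# County A: older population, lower age-specific death rates.
deaths_a, pop_a = [4, 40, 200], [40_000, 40_000, 20_000]
# County B: younger population, higher age-specific death rates.
deaths_b, pop_b = [16, 30, 75], [80_000, 15_000, 5_000]

cdr_a = crude_rate(deaths_a, pop_a)
cdr_b = crude_rate(deaths_b, pop_b)
asdr_a = age_standardized_rate(deaths_a, pop_a, STD_WEIGHTS)
asdr_b = age_standardized_rate(deaths_b, pop_b, STD_WEIGHTS)
```

In this example county A has the higher crude rate (about 244 versus 121 per 100,000) because its population is older, while county B has the higher age-standardized rate (about 174 versus 104.5 per 100,000) because its age-specific rates are higher in every group. The ASDR requires a death count in every age group, which is why suppressed cells must be imputed before it can be computed.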

Conclusions: Bayesian imputation of suppressed county-level, age-specific COVID-19 deaths in US provisional data can improve county ASDR estimates and aid public health officials in identifying disparities in deaths from COVID-19.

MeSH terms

  • Bayes Theorem
  • COVID-19* / epidemiology
  • Humans
  • United States / epidemiology

Grants and funding

The author(s) received no specific funding for this work.