Comparisons of statistical methods for handling attrition in a follow-up visit with complex survey sampling

Stat Med. 2023 May 20;42(11):1641-1668. doi: 10.1002/sim.9692. Epub 2023 Mar 7.

Abstract

Design-based analysis, which accounts for the design features of the study, is commonly used to analyze data from studies with complex survey sampling, such as the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Attrition is often a problem in longitudinal studies of this type. Although various statistical approaches have been proposed to handle attrition, such as inverse probability weighting (IPW), non-response cell weighting (NRCW), multiple imputation (MI), and the full information maximum likelihood (FIML) approach, their performance in design-based analyses has not been systematically compared. In this article, we perform extensive simulation studies and compare the performance of different missing data methods in linear and generalized linear population models and under different missing data mechanisms. We find that design-based analysis produces valid estimation and statistical inference when the missing data are handled appropriately using the IPW, NRCW, MI, or FIML approach, under a missing-completely-at-random or missing-at-random mechanism, and when the missingness model is correctly specified or over-specified. We also illustrate the use of these methods using data from HCHS/SOL.
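As a rough illustration of one of the weighting approaches named above, the following minimal sketch shows the general idea behind non-response cell weighting: within each adjustment cell, the sampling weights of follow-up respondents are inflated so that they also represent the participants lost to attrition. This is not the authors' implementation. The function name `nrcw_adjust`, the record keys, the cell labels, and the toy weights are all hypothetical and chosen only for the example.

```python
from collections import defaultdict


def nrcw_adjust(records):
    """Return attrition-adjusted weights, one per input record.

    Each record is a dict with hypothetical keys:
      'cell'      - adjustment cell label (e.g. an age group by sex cell)
      'weight'    - baseline sampling weight (assumed positive)
      'responded' - True if the participant attended the follow-up visit
    """
    total = defaultdict(float)  # weighted count of everyone in the cell
    resp = defaultdict(float)   # weighted count of respondents in the cell
    for r in records:
        total[r["cell"]] += r["weight"]
        if r["responded"]:
            resp[r["cell"]] += r["weight"]

    adjusted = []
    for r in records:
        if not r["responded"]:
            # Non-respondents drop out of the follow-up analysis.
            adjusted.append(0.0)
        else:
            # Inflate by the inverse of the cell's weighted response rate.
            adjusted.append(r["weight"] * total[r["cell"]] / resp[r["cell"]])
    return adjusted


# Toy data (made up for illustration only).
records = [
    {"cell": "A", "weight": 100.0, "responded": True},
    {"cell": "A", "weight": 100.0, "responded": False},
    {"cell": "B", "weight": 50.0, "responded": True},
    {"cell": "B", "weight": 150.0, "responded": True},
    {"cell": "B", "weight": 100.0, "responded": False},
]

adjusted = nrcw_adjust(records)
print(adjusted)
```

In this sketch the adjusted weights of respondents sum to the total baseline weight within each cell, so weighted population totals are preserved. A real design-based analysis would additionally carry the strata and cluster identifiers through to variance estimation, which this example omits.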

Keywords: attrition; complex survey sampling design; design-based analysis; missing data.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Simulation
  • Follow-Up Studies
  • Humans
  • Linear Models
  • Longitudinal Studies
  • Models, Statistical*
  • Probability