Imputation strategies for missing data in a school-based multi-centre study: the Pathways study

Stat Med. 2001 Jan 30;20(2):305-16. doi: 10.1002/1097-0258(20010130)20:2<305::aid-sim645>3.0.co;2-m.

Abstract

Pathways is a multi-centre school-based trial sponsored by the National Heart, Lung, and Blood Institute testing the efficacy of an obesity prevention intervention in American Indian children. During the study's protocol development, we prepared an analysis plan that accounted for missing data. In this paper, we present a case study of the process we used to decide upon the final analysis plan. The primary endpoint of the Pathways study is a comparison of per cent body fat between treatment and usual care groups at the end of a three-year intervention. Other studies on children and Native Americans have had moderate to large amounts of missing data. As a result, we were concerned that missing data in Pathways would affect the type I error rate and power of the test of our primary endpoint. We present results from our evaluation of three alternative procedures. The first is a multiple imputation procedure in which we replace missing values with values resampled from the observed data. The second is based on the Wilcoxon rank sum test; missing data in the intervention group receive the worst ranks. The third is a multiple imputation procedure in which we replace missing values with predicted values from a regression equation whose coefficients are estimated from observed follow-up data and baseline values. We found that the multiple imputation procedure that replaces missing values with predicted values had the best properties of the procedures we considered. The results from our simulation study showed that, for missing data patterns that are relevant to the Pathways study, this procedure has high power and maintains the type I error rate. Published in 2001 by John Wiley & Sons, Ltd.
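
As an illustration of the third procedure described in the abstract, the following is a minimal Python sketch of regression-based multiple imputation, not the authors' implementation. Several details are assumptions that the abstract does not specify: a single pooled regression of follow-up on one baseline covariate, a bootstrap refit to reflect coefficient uncertainty, normally distributed residual noise, and combination of the imputed analyses with Rubin's rules. The names `fit_line` and `impute_and_compare` are hypothetical, and the sketch ignores any clustering of children within schools.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares intercept, slope, and residual SD for y regressed on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_sd = (rss / (len(x) - 2)) ** 0.5
    return intercept, slope, resid_sd


def impute_and_compare(baseline, followup, treated, m=20, seed=0):
    """Return (pooled difference in means, pooled standard error).

    followup holds None where the value is missing; treated is a list of
    booleans (True = intervention arm). Assumption: one pooled regression
    is used for both arms.
    """
    rng = random.Random(seed)
    observed = [(b, f) for b, f in zip(baseline, followup) if f is not None]
    estimates, variances = [], []
    for _ in range(m):
        # Refit on a bootstrap sample of the observed cases so that each
        # imputation uses slightly different coefficients (an assumption).
        boot = [rng.choice(observed) for _ in observed]
        a, b, s = fit_line([p[0] for p in boot], [p[1] for p in boot])
        # Replace each missing follow-up with a prediction plus residual noise.
        completed = [
            f if f is not None else a + b * x + rng.gauss(0.0, s)
            for x, f in zip(baseline, followup)
        ]
        trt = [y for y, t in zip(completed, treated) if t]
        ctl = [y for y, t in zip(completed, treated) if not t]
        estimates.append(statistics.fmean(trt) - statistics.fmean(ctl))
        variances.append(
            statistics.variance(trt) / len(trt)
            + statistics.variance(ctl) / len(ctl)
        )
    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

With no missing follow-up values, every completed data set is identical, so the pooled estimate reduces to the ordinary difference in arm means. Because the pooled regression ignores treatment arm, imputed values in this sketch are pulled toward the overall trend, which tends to shrink the estimated treatment difference.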

MeSH terms

  • Analysis of Variance
  • Body Height
  • Body Weight
  • Child
  • Computer Simulation*
  • Data Interpretation, Statistical*
  • Electric Impedance
  • Endpoint Determination
  • Humans
  • Indians, North American
  • Monte Carlo Method
  • Multicenter Studies as Topic / methods*
  • Obesity / prevention & control*
  • Schools
  • Skinfold Thickness