Derivation and assessment of risk prediction models using case-cohort data

BMC Med Res Methodol. 2013 Sep 13;13:113. doi: 10.1186/1471-2288-13-113.

Abstract

Background: Case-cohort studies are increasingly used to quantify the association of novel factors with disease risk. Conventional measures of predictive ability need modification for this design. We show how Harrell's C-index, Royston's D, and the category-based and continuous versions of the net reclassification index (NRI) can be adapted.

Methods: We simulated full-cohort and case-cohort data under two incidence rates, with sampling fractions ranging from 1% to 90%, using covariates from a cohort study of coronary heart disease. We then compared the accuracy and precision of the proposed risk prediction metrics.
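To make the sampling design concrete, the following minimal Python sketch draws a case-cohort sample from a full cohort: a random subcohort is selected at a given sampling fraction, and every case is then added regardless of subcohort membership. The function name `draw_case_cohort`, the toy cohort and the seed are illustrative assumptions; this is not the authors' simulation code, which is provided in Stata.

```python
import random


def draw_case_cohort(is_case, sampling_fraction, seed=0):
    """Return (subcohort, analysis_set) as sorted lists of row indices.

    Assumption: the subcohort is a simple random sample of the full
    cohort, drawn without regard to case status, and all cases are
    then added to form the analysis set.
    """
    n = len(is_case)
    rng = random.Random(seed)
    size = round(sampling_fraction * n)
    subcohort = set(rng.sample(range(n), size))
    cases = {i for i, flag in enumerate(is_case) if flag}
    return sorted(subcohort), sorted(subcohort | cases)


# Hypothetical full cohort: 200 people, every tenth one is a case.
is_case = [i % 10 == 0 for i in range(200)]
subcohort, analysis_set = draw_case_cohort(is_case, 0.10, seed=1)
```

Cases that fall outside the subcohort are the "non-subcohort cases"; they enter the analysis set only because of their outcome, which is why unweighted summaries computed on this set over-represent cases.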

Results: The C-index and D must be weighted in order to obtain unbiased results. The NRI does not need modification, provided that the relevant non-subcohort cases are excluded from the calculation. The empirical standard errors across simulations were consistent with analytical standard errors for the C-index and D but not for the NRI. Good relative efficiency of the prediction metrics was observed in our examples, provided the sampling fraction was above 40% for the C-index, 60% for D, or 30% for the NRI. Stata code is made available.
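As an illustration of what weighting the C-index can look like, the sketch below computes a Harrell-type concordance in which each usable pair contributes the product of the two subjects' weights. The weighting scheme is an assumption, since the abstract does not spell it out: cases get weight 1 and non-cases get the inverse of the subcohort sampling fraction. The function name `weighted_c_index` and the toy data are hypothetical, and tied survival times are simply skipped.

```python
def weighted_c_index(time, event, risk, weight):
    """Harrell-type C-index with pair weights weight[i] * weight[j].

    A pair (i, j) is usable when i has an observed event and j is
    still under observation after i's event time. Pairs with tied
    times are skipped (a simplifying assumption).
    """
    concordant = 0.0
    usable = 0.0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue
        for j in range(n):
            if time[j] <= time[i]:
                continue
            w = weight[i] * weight[j]
            usable += w
            if risk[i] > risk[j]:
                concordant += w
            elif risk[i] == risk[j]:
                concordant += 0.5 * w  # tied predictions count as half
    return concordant / usable if usable else float("nan")


# Toy case-cohort sample; sampling fraction 0.5 is an assumed value.
sampling_fraction = 0.5
time = [2, 5, 3, 8, 6]
event = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.4, 0.1, 0.5]

# Assumed inverse-probability weights: 1 for cases, 1/fraction for non-cases.
weight = [1.0 if e else 1.0 / sampling_fraction for e in event]

c_weighted = weighted_c_index(time, event, risk, weight)
c_unweighted = weighted_c_index(time, event, risk, [1.0] * len(time))
```

In this toy sample the weighted value is 11/13 and the unweighted value is 6/7. The gap arises because up-weighting the sampled non-cases changes how much each case versus non-case pair counts relative to each case versus case pair.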

Conclusions: Case-cohort designs can be used to provide unbiased estimates of the C-index, D measure and NRI.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Case-Control Studies
  • Computer Simulation
  • Coronary Disease / epidemiology*
  • Coronary Disease / etiology
  • Data Interpretation, Statistical
  • Humans
  • Incidence
  • Models, Statistical
  • Proportional Hazards Models
  • Risk Assessment
  • Risk Factors