Variable selection for case-cohort studies with failure time outcome

Biometrika. 2016 Sep;103(3):547-562. doi: 10.1093/biomet/asw027. Epub 2016 Aug 10.

Abstract

Case-cohort designs are widely used in large cohort studies to reduce the cost associated with covariate measurement. In many such studies the number of covariates is very large, so an efficient variable selection method is necessary. In this paper, we study the properties of a variable selection procedure using the smoothly clipped absolute deviation penalty in a case-cohort design with a diverging number of parameters. We establish the consistency and asymptotic normality of the maximum penalized pseudo-partial-likelihood estimator, and show that the proposed variable selection method is consistent and has an asymptotic oracle property. Simulation studies compare the finite-sample performance of the procedure with tuning parameter selection methods based on the Akaike information criterion and the Bayesian information criterion. We make recommendations for use of the proposed procedures in case-cohort studies, and apply them to the Busselton Health Study.

Keywords: Case-cohort design; Diverging number of parameters; Oracle property; Smoothly clipped absolute deviation; Survival analysis; Variable selection.
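
For readers unfamiliar with the smoothly clipped absolute deviation penalty named in the abstract, the following is a minimal sketch of its standard piecewise form for a single coefficient. The function names `scad_penalty` and `scad_total` are hypothetical, and the default shape parameter a = 3.7 is a conventional choice assumed here; the abstract does not state it. The sketch evaluates only the penalty term and does not implement the paper's penalized pseudo-partial-likelihood estimator or its case-cohort weighting.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient.

    theta : coefficient value
    lam   : tuning parameter (lambda >= 0)
    a     : shape parameter (> 2); 3.7 is an assumed conventional default
    """
    if lam < 0 or a <= 2:
        raise ValueError("requires lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the lasso (L1) penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic blend that tapers the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage is applied.
    return lam * lam * (a + 1) / 2


def scad_total(beta, lam, a=3.7):
    """Sum of the SCAD penalty over a coefficient vector (hypothetical helper)."""
    return sum(scad_penalty(b, lam, a) for b in beta)


if __name__ == "__main__":
    for value in (0.0, 0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

The flat region for large coefficients is what distinguishes this penalty from the lasso: large effects are left nearly unpenalized, which is the property typically associated with oracle-type results such as the one the abstract reports.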