Penalized regression for left-truncated and right-censored survival data

Stat Med. 2021 Nov 10;40(25):5487-5500. doi: 10.1002/sim.9136. Epub 2021 Jul 24.

Abstract

High-dimensional data are becoming increasingly common in the medical field as large volumes of patient information are collected and processed by high-throughput screening, electronic health records, and comprehensive genomic testing. Statistical models that attempt to study the effects of many predictors on survival typically implement feature selection or penalized methods to mitigate the undesirable consequences of overfitting. In some cases, survival data are also left-truncated, which can give rise to immortal time bias, but penalized survival methods that adjust for left truncation are not commonly implemented. To address these challenges, we apply a penalized Cox proportional hazards model for left-truncated and right-censored survival data and assess the implications of left truncation adjustment on bias and interpretation. We use simulation studies and a high-dimensional, real-world clinico-genomic database to highlight the pitfalls of failing to account for left truncation in survival modeling.
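To illustrate what adjusting for left truncation changes, the following is a minimal sketch of a lasso-penalized Cox objective in which a subject enters the risk set only after their entry (truncation) time. It is not the authors' implementation: the function names `risk_set` and `penalized_neg_log_pl`, the scaling of the log partial likelihood by the sample size, and the Breslow-style handling of tied event times are all assumptions made for illustration.

```python
import numpy as np


def risk_set(entry, time, t):
    """Boolean mask of subjects under observation at time t.

    With left truncation, a subject is at risk only after their entry time
    and up to (and including) their event or censoring time.
    """
    return (entry < t) & (time >= t)


def penalized_neg_log_pl(beta, X, entry, time, event, lam):
    """Lasso-penalized negative log partial likelihood (illustrative sketch).

    beta  : (p,) coefficient vector
    X     : (n, p) covariate matrix
    entry : (n,) left-truncation (study entry) times
    time  : (n,) observed event or censoring times
    event : (n,) 1 if the event was observed, 0 if right-censored
    lam   : non-negative L1 penalty weight (hypothetical parameter name)
    """
    eta = X @ beta
    log_pl = 0.0
    for i in np.flatnonzero(event):
        at_risk = risk_set(entry, time, time[i])
        # Each observed event contributes its linear predictor minus the
        # log-sum of exp(eta) over the truncation-adjusted risk set.
        log_pl += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    # Assumption: scale by n so lam is comparable across sample sizes.
    return -log_pl / len(time) + lam * np.sum(np.abs(beta))


# Toy data: subject 1 enters the study at time 2, after subject 0's event.
X = np.array([[0.5], [-1.0], [0.2]])
entry = np.array([0.0, 2.0, 0.0])
time = np.array([1.0, 3.0, 4.0])
event = np.array([1, 1, 0])
beta = np.zeros(1)

adjusted = penalized_neg_log_pl(beta, X, entry, time, event, lam=0.1)
# Ignoring truncation treats everyone as at risk from time zero.
naive = penalized_neg_log_pl(beta, X, np.zeros(3), time, event, lam=0.1)
print(adjusted, naive)
```

In the toy data, the naive version counts subject 1 as at risk at time 1 even though that subject had not yet entered the study. This is the kind of immortal time that the truncation-adjusted risk set excludes.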

Keywords: Cox model; high-dimensional data; lasso; left truncation; penalized regression; survival analysis.

MeSH terms

  • Bias
  • Computer Simulation
  • Humans
  • Models, Statistical*
  • Proportional Hazards Models