Pathway aggregation for survival prediction via multiple kernel learning

Stat Med. 2018 Jul 20;37(16):2501-2515. doi: 10.1002/sim.7681. Epub 2018 Apr 17.

Abstract

Predicting prognosis in cancer patients from high-dimensional genomic data, such as gene expression in tumor tissue, is complicated by the large number of features and the potentially complex relationship between features and outcome. Integrating prior biological knowledge into risk prediction by grouping genomic features into pathways and networks reduces the dimensionality of the problem and could improve prediction accuracy. Such knowledge-based models may also be more biologically grounded and interpretable. Prediction could potentially be improved further by allowing for complex nonlinear pathway effects. The kernel machine framework has been proposed as an effective approach for modeling the nonlinear and interactive effects of genes in pathways for both censored and noncensored outcomes. When multiple pathways are under consideration, one may efficiently select informative pathways and aggregate their signals via multiple kernel learning (MKL), which has been proposed for prediction of noncensored outcomes. In this paper, we propose MKL methods for censored survival outcomes. We derive our approach for a general survival modeling framework with a convex objective function and illustrate its application under the Cox proportional hazards and semiparametric accelerated failure time models. Numerical studies demonstrate that the proposed MKL-based prediction methods work well in finite samples and can potentially outperform models constructed assuming linear effects or ignoring the group knowledge. The methods are illustrated with an application to 2 cancer data sets.
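
As a rough illustration of the idea described above, the following Python sketch builds one kernel per pathway, aggregates the kernels with nonnegative weights, and evaluates a Cox negative log partial likelihood for a kernel-based risk score. It is not the estimator proposed in the paper: the Gaussian kernel, the fixed weights, the ridge-type penalty, the absence of tie handling, and all names (gaussian_kernel, combine_kernels, cox_neg_log_partial_likelihood, pathway_A, pathway_B) are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernel(X, bandwidth):
    """Gaussian (RBF) kernel matrix for the rows of X (an assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def combine_kernels(kernels, weights):
    """Aggregate pathway kernels as a nonnegative weighted sum."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be nonnegative")
    return sum(w * K for w, K in zip(weights, kernels))


def cox_neg_log_partial_likelihood(f, time, event):
    """Cox negative log partial likelihood for risk scores f (no tie correction)."""
    # at_risk[i, j] is True when subject j is still at risk at subject i's time
    at_risk = time[None, :] >= time[:, None]
    log_denom = np.log((at_risk * np.exp(f)[None, :]).sum(axis=1))
    return -np.sum(event * (f - log_denom))


# Toy data: 20 subjects, 6 genes split into two hypothetical pathways.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
pathways = {"pathway_A": [0, 1, 2], "pathway_B": [3, 4, 5]}
time = rng.exponential(scale=1.0, size=20)
event = rng.integers(0, 2, size=20)

kernels = [gaussian_kernel(X[:, idx], bandwidth=1.0) for idx in pathways.values()]
weights = [0.7, 0.3]  # fixed here; an MKL method would estimate these from data
K = combine_kernels(kernels, weights)

alpha = np.zeros(20)  # kernel coefficients; risk score is f = K @ alpha
f = K @ alpha
penalty = 0.5 * alpha @ K @ alpha  # ridge-type penalty in the kernel's function space
objective = cox_neg_log_partial_likelihood(f, time, event) + penalty
print(objective)
```

In an actual MKL fit, the coefficients and the kernel weights would be optimized jointly, typically with a sparsity-inducing constraint on the weights so that uninformative pathways receive weight zero; the sketch only shows how the pieces fit together.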

Keywords: Cox proportional hazards model; accelerated failure time model; kernel machines; multiple kernel learning; risk prediction.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biometry / methods*
  • Computer Simulation
  • Genomics / methods
  • Humans
  • Linear Models
  • Models, Biological
  • Neoplasms / genetics
  • Nonlinear Dynamics
  • Prognosis*
  • Proportional Hazards Models*