Nonstationary multivariate Gaussian processes for electronic health records

J Biomed Inform. 2021 May:117:103698. doi: 10.1016/j.jbi.2021.103698. Epub 2021 Feb 19.

Abstract

Advances in the modeling and analysis of electronic health records (EHR) have the potential to improve patient risk stratification, leading to better patient outcomes. The modeling of complex temporal relations across the multiple clinical variables inherent in EHR data is largely unexplored. Existing approaches to modeling EHR data often lack the flexibility to handle time-varying correlations across multiple clinical variables, or they are too complex for clinical interpretation. We therefore propose a novel nonstationary multivariate Gaussian process model for EHR data that addresses these drawbacks. The proposed model captures time-varying scale, correlation and smoothness across multiple clinical variables. We also provide details on two inference approaches: maximum a posteriori estimation and Hamiltonian Monte Carlo. We validate the model on synthetic data and then demonstrate its effectiveness on EHR data from the Kaiser Permanente Division of Research (KPDOR). Finally, we use the KPDOR EHR data to investigate the relationships between a clinical patient risk metric and the latent processes of the proposed model, and we demonstrate statistically significant correlations between them.

Keywords: Cross-covariance function; Linear model of coregionalization; Sepsis; Time-varying coefficient.
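
The abstract does not give the model's equations, so the following is only an illustrative sketch of how a linear model of coregionalization with time-varying coefficients can produce time-varying scale, correlation and smoothness. It is not the authors' implementation. It assumes two outputs y(t) = A(t) u(t) driven by two independent latent Gaussian processes, uses a Gibbs kernel (a squared-exponential kernel with an input-dependent lengthscale) for the time-varying smoothness, and all function names, the mixing matrix A(t), and the parameter curves are hypothetical choices made for illustration.

```python
import numpy as np


def gibbs_kernel(t, lengthscale_fn):
    """Gibbs kernel: squared-exponential with an input-dependent lengthscale."""
    ell = lengthscale_fn(t)                                  # shape (n,)
    sq_sum = ell[:, None] ** 2 + ell[None, :] ** 2
    prefactor = np.sqrt(2.0 * ell[:, None] * ell[None, :] / sq_sum)
    sq_dist = (t[:, None] - t[None, :]) ** 2
    return prefactor * np.exp(-sq_dist / sq_sum)             # k(t, t) = 1


def nonstationary_lmc_cov(t, mixing_fn, lengthscale_fns):
    """Covariance of y(t) = A(t) u(t) with independent latent GPs u_q.

    Returns a (P*n, P*n) matrix ordered output-major: row p*n + i is
    output p at time t[i].
    """
    A = mixing_fn(t)                                         # shape (n, P, Q)
    n, P, Q = A.shape
    K = np.zeros((P * n, P * n))
    for q in range(Q):
        Kq = gibbs_kernel(t, lengthscale_fns[q])
        a = A[:, :, q].T.reshape(-1)                         # column q, output-major
        # Entry ((p, i), (p', j)) gets A[i, p, q] * A[j, p', q] * Kq[i, j].
        K += np.outer(a, a) * np.tile(Kq, (P, P))
    return K


def example_mixing(t):
    """Hypothetical A(t): a Cholesky-style factor with varying scales and correlation."""
    scale1 = 1.0 + 0.5 * t
    scale2 = 2.0 - 0.5 * t
    rho = 0.9 * np.cos(np.pi * t)                            # correlation flips sign over time
    A = np.zeros((t.size, 2, 2))
    A[:, 0, 0] = scale1
    A[:, 1, 0] = scale2 * rho
    A[:, 1, 1] = scale2 * np.sqrt(1.0 - rho ** 2)
    return A


t = np.linspace(0.0, 1.0, 25)
# Hypothetical lengthscale curves: one latent process gets smoother, the other rougher.
lengthscales = [lambda s: 0.1 + 0.3 * s, lambda s: 0.4 - 0.2 * s]
K = nonstationary_lmc_cov(t, example_mixing, lengthscales)

# Draw one joint sample of both outputs; the jitter keeps the Cholesky factor stable.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(K + 1e-6 * np.eye(K.shape[0]))
sample = (L @ rng.standard_normal(K.shape[0])).reshape(2, t.size)
```

With this particular A(t), the marginal standard deviations of the two outputs at time t are the two scale curves, and their instantaneous correlation equals rho(t), so each time-varying quantity is controlled by its own curve. In a full model those curves would themselves be inferred, for example by maximum a posteriori estimation or Hamiltonian Monte Carlo.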

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Electronic Health Records*
  • Humans
  • Normal Distribution