Temporally informed random forests for suicide risk prediction

J Am Med Inform Assoc. 2021 Dec 28;29(1):62-71. doi: 10.1093/jamia/ocab225.

Abstract

Objective: Suicide is one of the leading causes of death worldwide, yet clinicians find it difficult to reliably identify individuals at high risk for suicide. Algorithmic approaches for suicide risk detection have been developed in recent years, mostly based on data from electronic health records (EHRs). Significant room for improvement remains in the way these models take advantage of temporal information to improve predictions.

Materials and methods: We propose a temporally enhanced variant of the random forest (RF) model, Omni-Temporal Balanced Random Forests (OT-BRFs), that incorporates temporal information in every tree within the forest. We develop and validate this model using longitudinal EHRs and clinician notes from the Mass General Brigham Health System recorded between 1998 and 2018, and compare its performance to a baseline Naive Bayes Classifier and 2 standard versions of balanced RFs.
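
The abstract does not spell out how temporal information is placed in every tree. As one possible reading, the minimal Python sketch below plans each tree with a balanced bootstrap (equal numbers of cases and controls) and a feature set that always contains the temporal variables plus a random draw of the remaining ones. The function names, the feature names, and the per-tree (rather than per-split) feature sampling are all assumptions made for illustration, not the authors' implementation.

```python
import random

# Hypothetical feature names. The three temporal ones mirror the variables
# described in the abstract; the others are placeholders.
TEMPORAL = ["total_visits", "visit_rate", "ehr_coverage_days"]
OTHER = ["dx_depression", "dx_anxiety", "rx_opioid", "note_term_hopeless", "age", "sex"]


def plan_tree(case_ids, control_ids, temporal, other, n_other, rng):
    """Choose the training rows and candidate features for one tree."""
    n = min(len(case_ids), len(control_ids))
    # Balanced bootstrap: the same number of cases and controls, drawn with replacement.
    rows = rng.choices(case_ids, k=n) + rng.choices(control_ids, k=n)
    # Temporal variables are always offered to the tree (assumed mechanism);
    # the remaining features are sampled without replacement.
    features = list(temporal) + rng.sample(other, n_other)
    return rows, features


def plan_forest(case_ids, control_ids, temporal, other, n_other, n_trees, seed=0):
    """Return one (rows, features) plan per tree in the forest."""
    rng = random.Random(seed)
    return [
        plan_tree(case_ids, control_ids, temporal, other, n_other, rng)
        for _ in range(n_trees)
    ]
```

Each plan could then be handed to any decision tree learner. A standard balanced RF, by contrast, would sample from all features without guaranteeing that the temporal ones are present.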

Results: Temporal variables were found to be associated with suicide risk: Elevated suicide risk was observed in individuals with a higher total number of visits as well as those with a low rate of visits over time, while lower suicide risk was observed in individuals with a longer period of EHR coverage. RF models were more accurate than Naive Bayes classifiers at predicting suicide risk in advance (area under the receiver operating characteristic curve = 0.824 vs. 0.754, respectively). The proposed OT-BRF model performed best among all RF approaches, yielding a sensitivity of 0.339 at 95% specificity, compared to 0.290 and 0.286 for the other 2 RF models. Temporal variables were assigned high importance by the models that incorporated them.
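
For readers unfamiliar with the operating point reported here, sensitivity at a fixed specificity can be computed from predicted risk scores roughly as in the following minimal Python sketch. The function name and the thresholding convention (predict positive when the score is strictly above the threshold) are assumptions made for illustration; the study's own evaluation code is not described in the abstract.

```python
def sensitivity_at_specificity(y_true, scores, target_specificity=0.95):
    """Highest sensitivity achievable while specificity stays at or above the target.

    y_true holds 1 for cases and 0 for controls; scores holds predicted risk.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one control")

    best = 0.0
    # Try every observed score as a threshold; predict positive when score > threshold.
    for threshold in sorted(set(scores)):
        specificity = sum(s <= threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in positives) / len(positives)
            best = max(best, sensitivity)
    return best
```

With 20 controls and a 95% specificity target, for example, the threshold may misclassify at most one control; the function returns the fraction of cases scoring above the most permissive threshold that respects that limit.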

Discussion: We demonstrate that temporal variables have an important role to play in suicide risk detection and that requiring their inclusion in all RF trees leads to increased predictive performance. Integrating temporal information into risk prediction models helps the models interpret patient data in temporal context, improving predictive performance.

Keywords: clinical risk; modeling; random forest; suicide; temporal.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Electronic Health Records*
  • Humans
  • Risk Assessment
  • Suicide*