Predicting Suicide Among US Veterans Using Natural Language Processing-enriched Social and Behavioral Determinants of Health

Res Sq [Preprint]. 2024 Apr 23:rs.3.rs-4290732. doi: 10.21203/rs.3.rs-4290732/v1.

Abstract

Although the association between social and behavioral determinants of health (SBDH) and suicide risk is well recognized, SBDH recorded in unstructured electronic health record (EHR) notes remain underutilized in suicide predictive modeling. This study investigates the impact of SBDH, identified from structured data and, using a natural language processing (NLP) system, from unstructured notes, on suicide prediction within 7, 30, 90, and 180 days of discharge. Using EHR data of 2,987,006 Veterans between October 1, 2009, and September 30, 2015, from the US Veterans Health Administration (VHA), we designed a case-control study. It demonstrates that incorporating structured and NLP-extracted SBDH significantly enhances the performance of three architecturally distinct suicide predictive models - elastic-net logistic regression, random forest (RF), and multilayer perceptron. For example, after integrating NLP-extracted SBDH, RF achieved notable improvements in suicide prediction within 180 days of discharge: the area under the receiver operating characteristic curve increased from 83.57% to 84.25% (95% CI = 0.63%-0.98%, p < 0.001) and the area under the precision recall curve increased from 57.38% to 59.87% (95% CI = 3.86%-4.82%, p < 0.001). These findings underscore the potential of NLP-extracted SBDH to enhance suicide prediction across various prediction timeframes, offering valuable insights for healthcare practitioners and policymakers.
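
The abstract reports confidence intervals alongside the gains in discrimination but does not say how they were computed. As a minimal sketch of one common approach, the Python below estimates a paired percentile-bootstrap interval for the change in the area under the receiver operating characteristic curve between a baseline model and an SBDH-enriched model scored on the same patients. The function names, the bootstrap procedure, and the toy labels and scores are illustrative assumptions, not the study's method or data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a case outranks a control (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_gain(labels, base_scores, enriched_scores, n_boot=1000, seed=0):
    """Observed AUROC gain and a 95% percentile-bootstrap interval (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auroc(labels, enriched_scores) - auroc(labels, base_scores)
    gains = []
    while len(gains) < n_boot:
        # Resample patients with replacement; both models share the same resample.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lack either cases or controls
        gain = (auroc(ys, [enriched_scores[i] for i in idx])
                - auroc(ys, [base_scores[i] for i in idx]))
        gains.append(gain)
    gains.sort()
    lower = gains[int(0.025 * n_boot)]
    upper = gains[min(n_boot - 1, int(0.975 * n_boot))]
    return observed, (lower, upper)


# Hypothetical toy data: 1 = case, 0 = control; scores are made-up risk estimates.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
base_scores = [0.62, 0.48, 0.55, 0.30, 0.50, 0.35, 0.20, 0.41, 0.15, 0.58, 0.25, 0.33]
enriched_scores = [0.70, 0.52, 0.66, 0.45, 0.44, 0.30, 0.18, 0.40, 0.12, 0.50, 0.22, 0.31]

gain, (low, high) = bootstrap_auroc_gain(labels, base_scores, enriched_scores, n_boot=500)
print(f"AUROC gain: {gain:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Resampling patients once per replicate and scoring both models on that same resample keeps the comparison paired, which is what an interval on the improvement requires. The same loop could be applied to the area under the precision recall curve by swapping in a different metric function.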

Publication types

  • Preprint