Enhancing post-traumatic stress disorder patient assessment: Leveraging Natural Language Processing for Research of Domain Criteria Identification using electronic medical records

Res Sq [Preprint]. 2024 Feb 21:rs.3.rs-3973337. doi: 10.21203/rs.3.rs-3973337/v1.

Abstract

Background: Extracting Research Domain Criteria (RDoC) information from high-risk populations, such as patients with post-traumatic stress disorder (PTSD), is crucial for improving mental health outcomes and informing policy. However, collecting, integrating, and effectively using clinical notes for this purpose is complex.

Methods: We created a natural language processing (NLP) workflow to analyze electronic medical record (EMR) data and to identify and extract RDoC using a pre-trained transformer-based natural language model, all-mpnet-base-v2. We built dictionaries from 100,000 clinical notes and analyzed 5.67 million clinical notes from 38,807 PTSD patients from the University of Pittsburgh Medical Center. We then demonstrated the significance of our approach by extracting and visualizing RDoC information in two use cases: (i) across multiple patient populations and (ii) throughout various disease trajectories.
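
To make the matching step concrete, the following is a minimal sketch of how a sentence embedding might be compared against per-domain dictionary term embeddings with a cosine similarity threshold. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cosine_similarity`, `tag_domains`), the domain keys, and the three-dimensional toy vectors are all hypothetical. In practice the vectors would be produced by the sentence transformer model, and taking the best-matching term per domain is one plausible matching rule among several. The default threshold of 0.3 is the value reported in the Results.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def tag_domains(sentence_vec, domain_term_vecs, threshold=0.3):
    """Return {domain: best similarity} for every domain whose closest
    dictionary term meets the threshold (hypothetical matching rule)."""
    hits = {}
    for domain, term_vecs in domain_term_vecs.items():
        best = max(
            (cosine_similarity(sentence_vec, t) for t in term_vecs),
            default=0.0,
        )
        if best >= threshold:
            hits[domain] = best
    return hits


# Toy embeddings standing in for model output (assumption: real vectors
# would come from the sentence transformer and have far more dimensions).
domain_term_vecs = {
    "negative_valence": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "sensorimotor": [[0.0, 0.0, 1.0]],
}
sentence_vec = [0.8, 0.6, 0.0]

hits = tag_domains(sentence_vec, domain_term_vecs)
print(hits)
```

With these toy vectors only the hypothetical `negative_valence` domain clears the threshold, because the sentence vector is orthogonal to the single `sensorimotor` term.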

Results: The sentence transformer model demonstrated superior F1 macro scores across all RDoC domains, achieving the highest performance with a cosine similarity threshold value of 0.3. This ensured an F1 score of at least 80% across all RDoC domains. The study revealed consistent reductions in all six RDoC domains among PTSD patients after psychotherapy. Women had the highest abnormalities of sensorimotor systems, while veterans had the highest abnormalities of negative and positive valence systems. The domains following first diagnoses of PTSD were associated with heightened cue reactivity to trauma, suicide, alcohol, and substance consumption.
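
As a reference for the metric named above, here is a short sketch of a macro-averaged F1 score computed separately for each domain and checked against the 80% floor. It assumes each domain is scored as a binary task (domain present or absent in a sentence) and that the macro average is taken over those two classes; the abstract does not state the exact averaging scheme, so this is only one reasonable reading. The labels and domain names are made-up toy data, not study results.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy gold labels and predictions per domain (hypothetical values).
labels_by_domain = {
    "negative_valence": ([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 0]),
    "sensorimotor": ([1, 0, 1, 0], [1, 0, 1, 0]),
}

scores = {
    domain: macro_f1(y_true, y_pred)
    for domain, (y_true, y_pred) in labels_by_domain.items()
}
meets_floor = all(score >= 0.80 for score in scores.values())
print(scores, meets_floor)
```

Repeating this evaluation for several candidate thresholds and keeping the one with the best scores is one way a value such as 0.3 could be selected.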

Conclusions: The findings provide initial insights into RDoC functioning in different populations and disease trajectories. Natural language processing proves valuable for capturing real-time, context-dependent RDoC instances from extensive clinical notes.

Keywords: Post-traumatic stress disorder; clinical notes; natural language processing; real-world evidence; research of domain criteria.

Publication types

  • Preprint