Identifying Patient Phenotype Cohorts Using Prehospital Electronic Health Record Data

Rachel Stemerman; Thomas Bunning; Joseph Grover; Rebecca Kitzmiller; Mehul D Patel

doi:10.1080/10903127.2020.1859658

Prehosp Emerg Care. 2021 Jan 25:1-14. doi: 10.1080/10903127.2020.1859658. Online ahead of print.

Authors

Rachel Stemerman¹, Thomas Bunning¹, Joseph Grover¹, Rebecca Kitzmiller¹, Mehul D Patel¹

Affiliation

¹ Received November 19, 2020 from Carolina Health Informatics Program, University of North Carolina, Chapel Hill, North Carolina (RS, RK); Department of Anesthesiology, Duke University Medical Center, Durham, North Carolina (TB); Department of Emergency Medicine, University of North Carolina, Chapel Hill, North Carolina (JG, MDP). Revision received; accepted for publication December 1, 2020.

PMID: 33315497

Abstract

Objective: Emergency medical services (EMS) provide critical interventions for patients with acute illness and injury and are important in implementing prehospital emergency care research. Retrospective, manual patient record review, the current reference standard for identifying patient cohorts, requires significant time and financial investment. We developed automated classification models to identify eligible patients for prehospital clinical trials using EMS clinical notes and compared model performance to manual review.

Methods: With eligibility criteria for an ongoing prehospital study of chest pain patients, we used EMS clinical notes (n = 1208) to manually classify patients as eligible, ineligible, or indeterminate. We randomly split these same records into training and test sets to develop and evaluate machine-learning (ML) algorithms using natural language processing (NLP) for feature (variable) selection. We compared models to the manual classification to calculate sensitivity, specificity, accuracy, positive predictive value, and F1 measure. We measured clinical expert time to perform review for manual and automated methods.

Results: ML models' sensitivity, specificity, accuracy, positive predictive value, and F1 measure ranged from 0.93 to 0.98. Compared to manual classification (N = 363 records), the automated method excluded 90.9% of records as ineligible, leaving only 33 records for manual review.

Conclusions: Our ML-derived approach demonstrates the feasibility of developing a high-performing, automated classification system using EMS clinical notes to streamline the identification of a specific cardiac patient cohort. This efficient approach can be leveraged to facilitate prehospital patient-trial matching and patient phenotyping (e.g., influenza-like illness) and to create prehospital patient registries.
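
The five performance measures named in the Methods can be computed from paired manual and model labels. The sketch below is illustrative only and is not the authors' code: it assumes a simplified binary labeling (eligible vs. not eligible, whereas the study also used an indeterminate class), and the function names and the toy labels are hypothetical. The last helper reproduces the workload arithmetic reported in the Results (363 records, 33 left for manual review).

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    accuracy: float
    ppv: float
    f1: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a measure is undefined.
    return numerator / denominator if denominator else float("nan")


def classification_metrics(reference, predicted) -> Metrics:
    """Compare model labels to manual-review labels (True = eligible).

    Assumption: binary labels only; the indeterminate class is not modeled.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    return Metrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        accuracy=_ratio(tp + tn, len(pairs)),
        ppv=_ratio(tp, tp + fp),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def fraction_excluded(total_records: int, remaining_for_review: int) -> float:
    """Share of records the automated step removed from manual review."""
    return (total_records - remaining_for_review) / total_records


# Toy labels (made up for illustration, not study data).
manual = [True, True, True, False, False, False, False, True]
model = [True, True, False, False, False, True, False, True]
toy_metrics = classification_metrics(manual, model)

# Figures from the Results: 363 records, 33 left for manual review.
excluded = round(fraction_excluded(363, 33), 3)  # 0.909
```

In practice, a library such as scikit-learn offers equivalent calculations; the hand-rolled version here only makes the definitions explicit.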

Keywords: Emergency medical services; machine learning; natural language processing; patient phenotype; prehospital.