Differentiating ischemic stroke patients from healthy subjects using a large-scale, retrospective EEG database and machine learning methods

William Peterson; Nithya Ramakrishnan; Krag Browder; Nerses Sanossian; Peggy Nguyen; Ezekiel Fink

doi:10.1016/j.jstrokecerebrovasdis.2024.107714

J Stroke Cerebrovasc Dis. 2024 Jun;33(6):107714. doi: 10.1016/j.jstrokecerebrovasdis.2024.107714. Epub 2024 Apr 16.

Authors

William Peterson¹, Nithya Ramakrishnan², Krag Browder³, Nerses Sanossian⁴, Peggy Nguyen⁵, Ezekiel Fink⁶

Affiliations

¹ University of Virginia, Charlottesville, VA, United States. Electronic address: wcp7cp@virginia.edu.
² Baylor College of Medicine, Houston, TX, United States.
³ Aspen Insights, Dallas, TX, United States.
⁴ Roxanna Todd Hodges Stroke Program, United States; Keck School of Medicine of the University of Southern California, United States.
⁵ Keck School of Medicine of the University of Southern California, United States.
⁶ Houston Hospital, Houston, TX, United States; Weill Cornell School of Medicine Sciences, New York, NY, United States.

PMID: 38636829

Abstract

Objectives: We set out to develop a machine learning model capable of distinguishing patients presenting with ischemic stroke from a cohort of healthy subjects. The model relies on features computed from a 3-minute resting electroencephalogram (EEG) recording.
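
As an illustration of what computing features from a resting EEG recording can look like, the sketch below derives one spectral feature family (relative band power) and one temporal feature family (Hjorth parameters) from a single-channel signal. The abstract does not list the features actually used, so the band edges, the function names `relative_band_power` and `hjorth_parameters`, and the choice of a plain DFT are all assumptions made for this example. It is not the authors' pipeline.

```python
import cmath
import math

# Conventional EEG band edges in Hz (an assumption; the abstract does not
# specify which bands or features the study used).
BANDS = {
    "delta": (1.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def relative_band_power(signal, fs):
    """Relative power per band from a plain DFT (O(n^2), for short windows only)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    power = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                power[name] += abs(coeff) ** 2
    total = sum(power.values())
    return {name: (p / total if total > 0 else 0.0) for name, p in power.items()}


def hjorth_parameters(signal):
    """Temporal-domain Hjorth activity, mobility, and complexity.

    Assumes a non-constant signal (a constant one would divide by zero).
    """
    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    d1 = [b - a for a, b in zip(signal, signal[1:])]  # first difference
    d2 = [b - a for a, b in zip(d1, d1[1:])]          # second difference
    activity = variance(signal)
    mobility = math.sqrt(variance(d1) / activity)
    complexity = math.sqrt(variance(d2) / variance(d1)) / mobility
    return activity, mobility, complexity
```

For a full 3-minute recording, an FFT-based estimator would be the practical choice; the quadratic DFT here only keeps the example dependency-free.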

Materials and methods: Using a large-scale, retrospective database of EEG recordings and matching clinical reports, we constructed a dataset of 1385 healthy subjects and 374 stroke patients. Because subjects often contributed more than one recording per session, the final dataset consisted of 2401 EEG recordings (63% healthy, 37% stroke).
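
When a dataset holds several recordings per subject, one common precaution is to partition by subject, so that recordings from the same person never appear in both the training and test sets. The abstract does not describe how the recordings were partitioned, so the following is only a minimal sketch of that general idea. The function `split_by_subject`, the tuple layout `(subject_id, label, features)`, and the default test fraction are hypothetical.

```python
import random


def split_by_subject(recordings, test_fraction=0.2, seed=0):
    """Split records so that no subject contributes to both sets.

    Each record is a (subject_id, label, features) tuple; this layout is an
    assumption made for the example.
    """
    subjects = sorted({rec[0] for rec in recordings})
    rng = random.Random(seed)  # seeded for a reproducible split
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [rec for rec in recordings if rec[0] not in test_subjects]
    test = [rec for rec in recordings if rec[0] in test_subjects]
    return train, test
```

Splitting on subject identifiers rather than on individual recordings keeps every recording from a given subject on one side of the split, whatever the number of recordings per subject.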

Results: Using a rich set of features encompassing both the spectral and temporal domains, our model yielded an AUC of 0.95, with a sensitivity and specificity of 93% and 86%, respectively. Allowing for multiple recordings per subject in the training set boosted sensitivity by 7%, attributable to a more balanced dataset.
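
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are conventionally computed from true labels and model outputs. It assumes stroke is coded as 1 and healthy as 0, and the function names are illustrative. It shows the standard definitions only and does not reproduce the study's evaluation code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is fine
    for small examples.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Sensitivity and specificity depend on the decision threshold applied to the scores, whereas AUC summarizes ranking quality across all thresholds.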

Conclusions: Our work demonstrates strong potential for the use of EEG in conjunction with machine learning methods to distinguish stroke patients from healthy subjects. Our approach provides a solution that is not only timely (3-minute recording time) but also highly precise and accurate (AUC: 0.95).

Keywords: Electroencephalogram (EEG); Feature engineering; Ischemic stroke; Large vessel occlusion; Machine learning; Prehospital stroke scale.

MeSH terms

Adult
Aged
Aged, 80 and over
Brain / physiopathology
Brain Waves*
Case-Control Studies
Databases, Factual*
Diagnosis, Computer-Assisted
Diagnosis, Differential
Electroencephalography*
Female
Humans
Ischemic Stroke* / diagnosis
Ischemic Stroke* / physiopathology
Machine Learning*
Male
Middle Aged
Predictive Value of Tests*
Reproducibility of Results
Retrospective Studies
Signal Processing, Computer-Assisted
Time Factors