Differentiating ischemic stroke patients from healthy subjects using a large-scale, retrospective EEG database and machine learning methods

J Stroke Cerebrovasc Dis. 2024 Jun;33(6):107714. doi: 10.1016/j.jstrokecerebrovasdis.2024.107714. Epub 2024 Apr 16.

Abstract

Objectives: We set out to develop a machine learning model capable of distinguishing patients presenting with ischemic stroke from a healthy cohort of subjects. The model relies on a 3-min resting electroencephalogram (EEG) recording from which features can be computed.
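The abstract does not list the individual features, so the following is only a minimal sketch of what one temporal-domain and one spectral-domain feature might look like for a single EEG channel. The choice of Hjorth mobility and relative alpha-band power, the function names, the 64 Hz sampling rate, and the synthetic 10 Hz test signal are all assumptions made for illustration, not the authors' pipeline.

```python
import cmath
import math


def hjorth_mobility(signal):
    """Temporal feature: sqrt(var(first difference) / var(signal))."""
    def variance(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    diffs = [b - a for a, b in zip(signal, signal[1:])]
    return math.sqrt(variance(diffs) / variance(signal))


def relative_band_power(signal, fs, low_hz, high_hz):
    """Spectral feature: share of non-DC power falling in [low_hz, high_hz)."""
    n = len(signal)
    total = 0.0
    band = 0.0
    # Naive DFT over positive frequencies; fine for a short toy signal,
    # a real pipeline would use an FFT-based estimator instead.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        power = abs(coeff) ** 2
        freq = k * fs / n
        total += power
        if low_hz <= freq < high_hz:
            band += power
    return band / total


# Hypothetical input: 2 s of a pure 10 Hz sine sampled at 64 Hz.
fs = 64
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(2 * fs)]

alpha_share = relative_band_power(signal, fs, 8.0, 13.0)  # alpha band, 8-13 Hz
mobility = hjorth_mobility(signal)
```

For this synthetic signal nearly all power falls inside the 8-13 Hz band, so `alpha_share` is close to 1. Real resting EEG would spread power across several bands.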

Materials and methods: Using a large-scale, retrospective database of EEG recordings and matching clinical reports, we constructed a dataset of 1385 healthy subjects and 374 stroke patients. Because subjects often produced more than one recording per session, the final dataset consisted of 2401 EEG recordings (63% healthy, 37% stroke).
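The counts above imply different class balances at the subject level and at the recording level. A short worked calculation makes this explicit. The per-class recording counts are approximations reconstructed from the rounded percentages and are not figures reported in the abstract.

```python
healthy_subjects = 1385
stroke_subjects = 374
total_recordings = 2401

# Subject-level balance
total_subjects = healthy_subjects + stroke_subjects
stroke_share_subjects = stroke_subjects / total_subjects  # roughly 0.21

# Recording-level counts, approximated from the rounded 63% / 37% split
approx_healthy_recordings = round(0.63 * total_recordings)
approx_stroke_recordings = round(0.37 * total_recordings)

# Approximate recordings contributed per subject in each class
healthy_rec_per_subject = approx_healthy_recordings / healthy_subjects
stroke_rec_per_subject = approx_stroke_recordings / stroke_subjects

print(f"subjects: {total_subjects}, stroke share: {stroke_share_subjects:.2f}")
print(f"recordings per healthy subject: {healthy_rec_per_subject:.2f}")
print(f"recordings per stroke patient: {stroke_rec_per_subject:.2f}")
```

Under these approximations, stroke patients make up about 21% of subjects but 37% of recordings, which suggests that stroke patients contributed more recordings each than healthy subjects did.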

Results: Using a rich set of features encompassing both the spectral and temporal domains, our model yielded an AUC of 0.95, with a sensitivity and specificity of 93% and 86%, respectively. Allowing multiple recordings per subject in the training set raised sensitivity by 7%, an improvement attributable to the more balanced dataset.
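For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and AUC are conventionally computed from labels and classifier scores. The labels, scores, and 0.5 decision threshold are toy values chosen for illustration only. They are not the study's data, and the helper names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 = stroke."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: four stroke (1) and four healthy (0) recordings.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports the three values together.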

Conclusions: Our work demonstrates strong potential for the use of EEG in conjunction with machine learning methods to distinguish stroke patients from healthy subjects. Our approach provides a solution that is not only timely (3-minute recording time) but also highly precise and accurate (AUC: 0.95).

Keywords: Electroencephalogram (EEG); Feature engineering; Ischemic stroke; Large vessel occlusion; Machine learning; Prehospital stroke scale.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Brain / physiopathology
  • Brain Waves*
  • Case-Control Studies
  • Databases, Factual*
  • Diagnosis, Computer-Assisted
  • Diagnosis, Differential
  • Electroencephalography*
  • Female
  • Humans
  • Ischemic Stroke* / diagnosis
  • Ischemic Stroke* / physiopathology
  • Machine Learning*
  • Male
  • Middle Aged
  • Predictive Value of Tests*
  • Reproducibility of Results
  • Retrospective Studies
  • Signal Processing, Computer-Assisted
  • Time Factors