Evaluating Augmentation Approaches for Deep Learning-based Major Depressive Disorder Diagnosis with Raw Electroencephalogram Data

bioRxiv [Preprint]. 2023 Dec 18:2023.12.15.571938. doi: 10.1101/2023.12.15.571938.

Abstract

While deep learning methods are increasingly applied in research contexts for neuropsychiatric disorder diagnosis, small dataset sizes limit their potential for clinical translation. Data augmentation (DA) could address this limitation, but the utility of EEG DA methods remains relatively underexplored in neuropsychiatric disorder diagnosis. In this study, we train a model for major depressive disorder diagnosis on raw EEG data and evaluate the utility of six EEG DA approaches. Importantly, to remove the bias that could be introduced by comparing models trained on larger augmented training sets with models trained on smaller baseline sets, we also introduce a new baseline trained on duplicated training data. Lastly, we examine the effects of the DA approaches on the representations learned by the model with a pair of explainability analyses. We find that while most approaches boost model performance, they do not improve it beyond that of simply using a duplicate training set without DA. The exception is channel dropout augmentation, which does improve model performance. These findings suggest the importance of comparing EEG DA methods to a baseline with a duplicate training set of equal size to the augmented training set. We also find that some DA methods increase model robustness to frequency (Fourier transform surrogates) and channel (channel dropout) perturbation. While our findings on EEG DA efficacy are restricted to our dataset and model, we hope that they will be helpful to future studies on deep learning for small EEG datasets and on new EEG DA methods.
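
The size-matched comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the dropout probability, whether dropped channels are zeroed, or how many augmented copies are generated, so the value p_drop=0.2, the zero-filling, the one-augmented-copy-per-sample ratio, and all function names below are assumptions made for illustration. Each EEG sample is represented as a plain list of channels, each channel a list of floats.

```python
import random


def channel_dropout(sample, p_drop=0.2, rng=None):
    """Return a copy of one EEG sample with each channel zeroed
    independently with probability p_drop (assumed value)."""
    if rng is None:
        rng = random.Random()
    return [
        [0.0] * len(channel) if rng.random() < p_drop else list(channel)
        for channel in sample
    ]


def augmented_training_set(samples, labels, augment, rng=None):
    """Original samples plus one augmented copy of each (twice the size)."""
    augmented = [augment(sample, rng=rng) for sample in samples]
    return samples + augmented, labels + labels


def duplicate_training_set(samples, labels):
    """Size-matched baseline: every sample appears twice, unmodified."""
    copies = [[list(channel) for channel in sample] for sample in samples]
    return samples + copies, labels + labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy samples, each with 3 channels and 4 time points.
    samples = [
        [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 1.0, 2.0, 3.0]],
        [[0.5, 0.4, 0.3, 0.2], [0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6]],
    ]
    labels = [0, 1]
    x_aug, y_aug = augmented_training_set(samples, labels, channel_dropout, rng)
    x_dup, y_dup = duplicate_training_set(samples, labels)
    # Both training sets have the same size, so any performance difference
    # cannot be attributed to the number of training examples alone.
    print(len(x_aug), len(x_dup))
```

Because both sets contain the same number of examples, a gap between a model trained on the augmented set and one trained on the duplicate set reflects the augmentation itself rather than the larger training set.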

Keywords: data augmentation; deep learning; electroencephalography; explainable AI; major depressive disorder.

Publication types

  • Preprint