Selection bias in the reported performances of AD classification pipelines

Neuroimage Clin. 2016 Dec 24;14:400-416. doi: 10.1016/j.nicl.2016.12.018. eCollection 2017.

Abstract

The last decade has seen a great proliferation of supervised learning pipelines for individual diagnosis and prognosis in Alzheimer's disease. As more pipelines are developed and evaluated in the search for greater performance, only those results that are relatively impressive will be selected for publication. We present an empirical study to evaluate the potential for optimistic bias in classification performance results as a result of this selection. This is achieved using a novel, resampling-based experiment design that effectively simulates the optimisation of pipeline specifications by individuals or collectives of researchers using cross validation with limited data. Our findings indicate that bias can plausibly account for an appreciable fraction (often greater than half) of the apparent performance improvement associated with the pipeline optimisation, particularly in small samples. We discuss the consistency of our findings with patterns observed in the literature and consider strategies for bias reduction and mitigation.
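The selection effect the abstract describes can be illustrated with a toy simulation. This is a minimal sketch of the general phenomenon, not a reproduction of the paper's resampling-based design: many candidate "pipelines" are scored on the same small sample, and the best score is reported. Here every pipeline is pure noise (true accuracy exactly 50%), yet the selected pipeline's apparent accuracy is inflated, while its accuracy on fresh data remains at chance.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_pipelines = 40, 50  # small evaluation sample, many candidates

# Labels and every pipeline's predictions are independent coin flips,
# so each pipeline's true accuracy is exactly 0.5.
y = rng.integers(0, 2, size=n_samples)
preds = rng.integers(0, 2, size=(n_pipelines, n_samples))

# Apparent accuracy of each pipeline on the shared evaluation sample.
accs = (preds == y).mean(axis=1)

best = accs.max()       # the figure that gets selected and reported
typical = accs.mean()   # hovers near the true chance level of 0.5

# Held-out check: the selected pipeline is still a coin flip on new data.
y_new = rng.integers(0, 2, size=1000)
preds_new = rng.integers(0, 2, size=1000)
heldout = (preds_new == y_new).mean()

print(f"best selected accuracy: {best:.2f}")
print(f"typical accuracy:       {typical:.2f}")
print(f"held-out accuracy:      {heldout:.2f}")
```

The gap between the best selected accuracy and the typical (or held-out) accuracy is the optimistic selection bias; it grows as the evaluation sample shrinks or the number of candidate pipelines grows, consistent with the abstract's finding that the bias is largest in small samples.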

Keywords: ADNI; Alzheimer's disease; Classification; Cross validation; Overfitting; Selection bias.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Alzheimer Disease / classification*
  • Alzheimer Disease / diagnostic imaging
  • Alzheimer Disease / epidemiology*
  • Bias
  • Humans
  • Image Interpretation, Computer-Assisted
  • Magnetic Resonance Imaging
  • Reproducibility of Results
  • Selection Bias