AutoComBat: a generic method for harmonizing MRI-based radiomic features

Sci Rep. 2022 Jul 26;12(1):12762. doi: 10.1038/s41598-022-16609-1.

Abstract

The use of multicentric data is becoming essential for developing generalizable radiomic signatures. In particular, Magnetic Resonance Imaging (MRI) data used in brain oncology are often heterogeneous in terms of scanners and acquisitions, which significantly impact quantitative radiomic features. Various methods have been proposed to decrease dependency, including methods acting directly on MR images, i.e., based on the application of several preprocessing steps before feature extraction or the ComBat method, which harmonizes radiomic features themselves. The ComBat method used for radiomics may be misleading and presents some limitations, such as the need to know the labels associated with the "batch effect". In addition, a statistically representative sample is required and the applicability of a signature whose batch label is not present in the train set is not possible. This work aimed to compare a priori and a posteriori radiomic harmonization methods and propose a code adaptation to be machine learning compatible. Furthermore, we have developed AutoComBat, which aims to automatically determine the batch labels, using either MRI metadata or quality metrics as inputs of the proposed constrained clustering. A heterogeneous dataset consisting of high and low-grade gliomas coming from eight different centers was considered. The different methods were compared based on their ability to decrease relative standard deviation of radiomic features extracted from white matter and on their performance on a classification task using different machine learning models. ComBat and AutoComBat using image-derived quality metrics as inputs for batch assignment and preprocessing methods presented promising results on white matter harmonization, but with no clear consensus for all MR images. Preprocessing showed the best results on the T1w-gd images for the grading task. For T2w-flair, AutoComBat, using either metadata plus quality metrics or metadata alone as inputs, performs better than the conventional ComBat, highlighting its potential for data harmonization. Our results are MRI weighting, feature class and task dependent and require further investigations on other datasets.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Brain / diagnostic imaging
  • Brain / pathology
  • Brain Neoplasms* / diagnostic imaging
  • Glioma* / diagnostic imaging
  • Glioma* / pathology
  • Humans
  • Machine Learning
  • Magnetic Resonance Imaging / methods