Diagnostic performance of classification trees and hematological functions in hematologic disorders: an application of multidimensional scaling and cluster analysis

BMC Med Inform Decis Mak. 2021 Nov 10;21(1):313. doi: 10.1186/s12911-021-01678-5.

Abstract

Background: Several hematological indices have already been proposed to discriminate between iron deficiency anemia (IDA) and β-thalassemia trait (βTT). This study compared the diagnostic performance of different hematological discrimination indices with that of decision trees and support vector machines in discriminating IDA from βTT, using multidimensional scaling and cluster analysis to group methods with similar performance. In addition, decision trees were used to determine the diagnostic classification scheme of patients.
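To illustrate what a hematological discrimination index looks like, the following minimal sketch computes the widely cited Mentzer index (MCV divided by RBC count), where values below 13 are conventionally taken to suggest βTT and values above 13 to suggest IDA. The abstract does not name the 43 indices evaluated, so the choice of this index, the function names, the handling of a value exactly at the cutoff, and the input values are all assumptions made for illustration.

```python
def mentzer_index(mcv_fl, rbc_millions_per_ul):
    """Mentzer index: MCV (fL) divided by RBC count (10^6 cells/uL)."""
    if rbc_millions_per_ul <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl / rbc_millions_per_ul


def classify_by_mentzer(mcv_fl, rbc_millions_per_ul, cutoff=13.0):
    """Return 'bTT' when the index is below the cutoff, otherwise 'IDA'.

    Assigning a value exactly at the cutoff to 'IDA' is an assumption
    of this sketch, not a rule taken from the study.
    """
    index = mentzer_index(mcv_fl, rbc_millions_per_ul)
    return "bTT" if index < cutoff else "IDA"


# Hypothetical patient values, not data from the study.
example_low = classify_by_mentzer(62.0, 5.8)   # index is about 10.7
example_high = classify_by_mentzer(72.0, 4.0)  # index is 18.0
```

A single-ratio rule like this is what the classification trees and support vector machines are being compared against.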

Methods: This cross-sectional study included 1178 patients with hypochromic microcytic anemia (708 with βTT and 470 with IDA). The diagnostic performance of 43 hematological discrimination indices was compared with that of classification tree algorithms and support vector machines in discriminating IDA from βTT. Multidimensional scaling and cluster analysis were then used to identify homogeneous subgroups of discrimination methods with similar performance.
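The grouping of discrimination methods by similar performance can be sketched as an agglomerative cluster analysis over per-method performance vectors. This is a minimal illustration only: the abstract does not state the distance measure or linkage rule used, so Euclidean distance and average linkage are assumptions, the method names are placeholders, and the (sensitivity, specificity) values are invented for the example rather than taken from the study.

```python
from itertools import combinations
from math import dist


def average_linkage_clusters(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    points maps a method name to its performance vector. Distance
    between clusters is the mean pairwise Euclidean distance
    (average linkage), which is an assumption of this sketch.
    """
    clusters = [{name} for name in points]

    def linkage(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical (sensitivity, specificity) pairs, not study results.
performance = {
    "method_A": (0.95, 0.93),
    "method_B": (0.94, 0.95),
    "method_C": (0.70, 0.60),
    "method_D": (0.72, 0.58),
}
groups = average_linkage_clusters(performance, n_clusters=2)
grouped_names = sorted(sorted(g) for g in groups)
```

With these placeholder values, the two high-performing methods fall into one subgroup and the two weaker methods into another.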

Results: All the classification tree algorithms except the LOTUS tree algorithm showed acceptable accuracy measures for discrimination between IDA and βTT in comparison with other hematological discrimination indices. The CRUISE and C5.0 tree algorithms had better diagnostic performance and efficiency than the other discrimination methods, with AUC values of 0.940 and 0.999, respectively, indicating excellent diagnostic accuracy. Both algorithms also identified mean corpuscular volume as the main variable in discrimination between IDA and βTT.
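For readers interpreting the AUC values, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. This pairwise form is equivalent to the
    Mann-Whitney U statistic divided by the number of pairs.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores, not data from the study.
auc_example = auc_from_scores([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
auc_chance = auc_from_scores([1, 2], [1, 2])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.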

Conclusions: As powerful data mining methods, the CRUISE and C5.0 tree algorithms can be used, along with other laboratory parameters, to develop accurate differential methods for the discrimination of IDA and βTT. In addition, multidimensional scaling and cluster analysis can be considered the most appropriate techniques for identifying discrimination indices with similar performance in future hematological studies.

Keywords: C5.0 tree algorithm; CRUISE tree algorithm; Classification tree algorithms; Diagnosis; Hematological discrimination indices; Iron deficiency anemia (IDA); β‐thalassemia trait (βTT).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anemia, Iron-Deficiency* / diagnosis
  • Cluster Analysis
  • Cross-Sectional Studies
  • Diagnosis, Differential
  • Humans
  • Multidimensional Scaling Analysis*