Machine learning models developed and internally validated for predicting chronicity in pediatric immune thrombocytopenia

J Thromb Haemost. 2024 Apr;22(4):1167-1178. doi: 10.1016/j.jtha.2023.12.006. Epub 2023 Dec 15.

Abstract

Background: Primary immune thrombocytopenia (ITP) in children is typically self-limiting; however, 20% to 30% of patients may experience prolonged thrombocytopenia lasting over a year. The challenge is predicting chronicity to ensure personalized treatment approaches.

Objectives: To address this issue, we developed and internally validated 4 machine learning (ML) models using demographic and immunologic characteristics to predict the likelihood of chronicity.

Methods: This study was conducted at Beijing Children's Hospital from June 2018 to December 2021 to develop models for predicting chronicity in pediatric ITP. Four ML models were built, based on a logistic regression classifier, a random forest classifier, eXtreme Gradient Boosting (XGBoost), and a support vector machine. Each model used a set of 16 variables, comprising 14 immunologic and 2 demographic predictors. Performance was evaluated by prediction accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC).
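As an illustration of the five evaluation criteria named in the Methods, the following minimal Python sketch computes them for a binary classifier using only the standard library. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for the example, with 1 standing for chronic disease and 0 for non-chronic disease.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at a fixed threshold (0.5 is an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the study.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.1, 0.4, 0.6]
    print(threshold_metrics(labels, scores))
    print(auroc(labels, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason it is often used to compare models.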

Results: Data were collected from 662 patients, who were randomly assigned to either a training dataset or a testing dataset using a random number generator. Among them, 26.5% had chronic disease. All models performed well, with AUROC values ranging from 0.81 to 0.84. XGBoost was selected to construct the predictive model because it had the highest AUROC and offered good interpretability. The XGBoost algorithm identified age, T helper 17, T helper 17-to-regulatory T cell ratio, T helper 1, and double-negative T cells as significant predictors.
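The random assignment described in the Results can be sketched as follows. This is an assumption-laden illustration rather than the study's procedure: the abstract does not report the split ratio, so the 70/30 proportion, the seed, and the function name `random_split` are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def random_split(patient_ids: Sequence[int],
                 train_fraction: float = 0.7,  # assumed ratio, not reported in the abstract
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patient identifiers with a seeded generator and split them in two."""
    rng = random.Random(seed)      # seeded so the assignment is reproducible
    shuffled = list(patient_ids)   # copy, leaving the caller's sequence untouched
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # 662 placeholder identifiers, matching only the cohort size given in the abstract.
    train_ids, test_ids = random_split(range(662))
    print(len(train_ids), len(test_ids))
```

Splitting by patient identifier, rather than by individual measurement, keeps every patient in exactly one dataset, so the testing dataset remains unseen during model fitting.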

Conclusion: Using ML, we developed a precise model for predicting chronicity in pediatric ITP during the initial phase of the disease. The XGBoost model achieved high predictive accuracy using individual patient clinical parameters and demonstrated commendable interpretability.

Keywords: children; chronicity; immune thrombocytopenia; machine learning; prediction.

MeSH terms

  • Algorithms
  • Area Under Curve
  • Child
  • Humans
  • Machine Learning
  • Purpura, Thrombocytopenic, Idiopathic* / diagnosis
  • Thrombocytopenia* / diagnosis