Establishment and validation of an artificial intelligence web application for predicting postoperative in-hospital mortality in patients with hip fracture: a National cohort study of 52,707 cases

Int J Surg. 2024 May 16. doi: 10.1097/JS9.0000000000001599. Online ahead of print.

Abstract

Background: In-hospital mortality following hip fractures is a significant concern, and accurate prediction of this outcome is crucial for appropriate clinical management. Nonetheless, there is a lack of effective prediction tools in clinical practice. By utilizing artificial intelligence and machine learning techniques, this study aims to develop a predictive model that can assist clinicians in identifying geriatric hip fracture patients at a higher risk of in-hospital mortality.

Methods: A total of 52,707 geriatric hip fracture patients treated with surgery at 90 hospitals were included in this study. The primary outcome was postoperative in-hospital mortality. The patients were randomly divided into two groups at a ratio of 7:3. The larger group, the training cohort, was used to develop the AI models, and the remaining patients, the validation cohort, were used to validate them. Six machine learning algorithms were employed for model development: logistic regression (LR), decision tree (DT), naïve Bayesian (NB), neural network (NN), eXGBoosting machine (eXGBM), and random forest (RF). A comprehensive scoring system incorporating 10 evaluation metrics was developed to assess prediction performance, with higher scores indicating superior predictive capability. An online AI application was then built on the best-performing machine learning model. In addition, the prediction performance of doctors and the AI application was compared.
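
The abstract does not state how the 10 evaluation metrics are combined into a single score. A minimal sketch of one plausible reading, assuming a rank-based rule in which each metric awards a model as many points as there are models minus the number of models that strictly beat it, is given below. The function name `rank_scores`, the model names, the metric values, and the tie handling are all hypothetical and are not taken from the study.

```python
from typing import Dict, Set


def rank_scores(
    metrics: Dict[str, Dict[str, float]],
    lower_is_better: Set[str],
) -> Dict[str, int]:
    """Sum rank-based points per model across all metrics.

    Assumed rule (not confirmed by the source): with n models, a model earns
    n minus the number of models that strictly beat it on that metric, so the
    best model gets n points and tied models share the higher score.
    """
    totals: Dict[str, int] = {}
    for metric_name, by_model in metrics.items():
        n = len(by_model)
        for model, value in by_model.items():
            if metric_name in lower_is_better:
                # Metrics such as Brier score or log loss: smaller is better.
                beaten_by = sum(1 for v in by_model.values() if v < value)
            else:
                # Metrics such as AUC or accuracy: larger is better.
                beaten_by = sum(1 for v in by_model.values() if v > value)
            totals[model] = totals.get(model, 0) + (n - beaten_by)
    return totals


# Hypothetical values for three placeholder models and two metrics.
example_metrics = {
    "auc": {"model_a": 0.90, "model_b": 0.85, "model_c": 0.80},
    "brier": {"model_a": 0.12, "model_b": 0.15, "model_c": 0.20},
}
example_totals = rank_scores(example_metrics, lower_is_better={"brier"})
print(example_totals)
```

Under this assumed rule, six models and 10 metrics give a maximum possible score of 60 per model.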

Findings: The eXGBM model exhibited the best prediction performance, with an AUC of 0.908 (95% CI: 0.881-0.932), as well as the highest accuracy (0.820), precision (0.817), specificity (0.814), and F1 score (0.822), and the lowest Brier score (0.120) and log loss (0.374). Additionally, the model showed favorable calibration, with a slope of 0.999 and an intercept of 0.028. According to the scoring system incorporating 10 evaluation metrics, the eXGBM model achieved the highest score (56), followed by the RF model (48) and NN model (41). The LR, DT, and NB models had total scores of 27, 30, and 13, respectively. The AI application has been deployed online at https://in-hospitaldeathinhipfracture-l9vhqo3l55fy8dkdvuskvu.streamlit.app/ , based on the eXGBM model. The comparative testing revealed that the AI application's predictive capabilities significantly outperformed those of the doctors in terms of AUC values (0.908 vs. 0.682, P <0.001).
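
For readers less familiar with the Brier score and log loss reported above, the following sketch computes both from their standard definitions for a binary outcome. It is illustrative only: the function names and the toy inputs are assumptions, and the code is not the evaluation pipeline used in the study.

```python
import math
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def log_loss(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-15) -> float:
    """Negative mean log-likelihood of the observed 0/1 outcomes."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 so the logarithm stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Toy example: an uninformative model that always predicts 0.5.
outcomes = [1, 0]
predictions = [0.5, 0.5]
print(brier_score(outcomes, predictions))
print(log_loss(outcomes, predictions))
```

Both metrics are lower for better predictions. A model that always predicts 0.5 scores a Brier score of 0.25 and a log loss of ln 2 (about 0.693), which gives a reference point for reading the reported values.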

Conclusions: The eXGBM model demonstrates promising predictive performance in assessing the risk of postoperative in-hospital mortality among geriatric hip fracture patients. The developed AI model serves as a valuable tool to enhance clinical decision-making.