Clinical validation of a deep-learning-based bone age software in healthy Korean children

Ann Pediatr Endocrinol Metab. 2024 Apr;29(2):102-108. doi: 10.6065/apem.2346050.025. Epub 2024 Jan 24.

Abstract

Purpose: Bone age (BA) is needed to assess developmental status and growth disorders. We evaluated the clinical performance of deep-learning-based BA software in estimating the chronological age (CA) of healthy Korean children.

Methods: This retrospective study included 371 healthy Korean children (217 boys, 154 girls), aged between 4 and 17 years, who visited the Department of Pediatrics for health check-ups between January 2017 and December 2018. A total of 553 left-hand radiographs from these children were evaluated using commercial deep-learning-based BA software (BoneAge, Vuno, Seoul, Korea). The clinical performance of the deep learning (DL) software was assessed against the CA using the concordance rate and Bland-Altman analysis.

Results: A 2-sample t-test (P<0.001) and Fisher exact test (P=0.011) showed a significant difference between the CA and the BA estimated by the DL software. The 2 variables were strongly correlated (r=0.96, P<0.001); however, the root mean square error was 15.4 months. With a 12-month cutoff, the concordance rate was 58.8%. The Bland-Altman plot showed that the DL software tended to underestimate the BA compared with the CA, especially in children under the age of 8.3 years.
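
The agreement measures reported above (root mean square error, concordance rate at a cutoff, and Bland-Altman bias) can be illustrated with a minimal sketch. This is not the authors' analysis code. It assumes that differences are taken as BA minus CA in months, that "concordant" means an absolute difference at or below the cutoff, and that limits of agreement are bias ± 1.96 SD; the abstract does not state these conventions. The function name `agreement_metrics` and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def agreement_metrics(ba_months, ca_months, cutoff=12.0):
    """Summarize agreement between estimated bone age and chronological age.

    Assumptions (not stated in the abstract): differences are BA - CA,
    concordance means |BA - CA| <= cutoff, and the limits of agreement
    are bias +/- 1.96 * SD of the differences.
    """
    if len(ba_months) != len(ca_months) or len(ba_months) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    diffs = [ba - ca for ba, ca in zip(ba_months, ca_months)]

    rmse = math.sqrt(mean(d * d for d in diffs))
    concordance = sum(abs(d) <= cutoff for d in diffs) / len(diffs)

    bias = mean(diffs)   # negative bias: BA underestimated relative to CA
    sd = stdev(diffs)    # sample standard deviation of the differences

    return {
        "rmse": rmse,
        "concordance": concordance,
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
    }


# Hypothetical values in months, for illustration only (not study data).
example_ba = [60, 100, 130, 150]
example_ca = [72, 110, 128, 170]
metrics = agreement_metrics(example_ba, example_ca)
```

Under this sign convention, a negative `bias` corresponds to the software underestimating BA relative to CA, which is the direction described in the Results.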

Conclusion: The DL-based BA software showed a low concordance rate and a tendency to underestimate the BA in healthy Korean children.

Keywords: Age determination by skeleton; Child; Child health; Deep learning; Software.