Optimizing vitiligo diagnosis with ResNet and Swin transformer deep learning models: a study on performance and interpretability

Sci Rep. 2024 Apr 21;14(1):9127. doi: 10.1038/s41598-024-59436-2.

Abstract

Vitiligo is a hypopigmented skin disease characterized by the loss of melanin. Its progressive nature and widespread incidence necessitate timely and accurate detection. A single diagnostic test often falls short of providing definitive confirmation of the condition, so assessment by dermatologists who specialize in vitiligo is required. However, the current scarcity of such specialists presents a significant challenge. To mitigate this issue and enhance diagnostic accuracy, it is essential to build deep learning models that can support and expedite the detection process. This study aims to establish a deep learning framework that enhances the diagnostic accuracy of vitiligo. To this end, a comparative analysis of five models, three from the ResNet series (ResNet34, ResNet50, and ResNet101) and two from the Swin Transformer series (Swin Transformer Base and Swin Transformer Large), was conducted under uniform conditions to identify the model with superior classification capability. Moreover, the study sought to improve interpretability by selecting a model that not only provides accurate diagnostic outcomes but also offers visual cues highlighting the regions pertinent to vitiligo. The empirical findings reveal that the Swin Transformer Large model achieved the best classification performance, with an AUC, accuracy, sensitivity, and specificity of 0.94, 93.82%, 94.02%, and 93.5%, respectively. In terms of interpretability, the highlighted regions in the class activation map correspond to the lesion regions of the vitiligo images, which shows that the map effectively indicates the class-specific regions associated with dermatological diagnostic decisions. Additionally, visualizing the feature maps generated in the middle layers of the deep learning model provides insight into the model's internal mechanisms, which is valuable for improving interpretability, tuning performance, and enhancing clinical applicability. The outcomes of this study underscore the significant potential of deep learning models to improve the accuracy and operational efficiency of medical diagnosis. The research highlights the need for ongoing exploration in this domain to fully leverage the capabilities of deep learning technologies in medical diagnostics.
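
For readers unfamiliar with the four metrics reported above, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and AUC are conventionally defined for a binary classifier. It is not the authors' evaluation code; the function names and all counts and scores are hypothetical values chosen for illustration and are not data from the study.

```python
# Illustrative sketch only: standard definitions of the reported metrics.
# All inputs below are made-up numbers, NOT results from the study.

def binary_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # share of true positive cases detected
        "specificity": tn / (tn + fp),    # share of true negative cases rejected
    }


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical confusion-matrix counts for a positive-vs-negative classifier.
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)

# Hypothetical predicted probabilities for positive and negative images.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

print(metrics, auc)
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can look favorable on an imbalanced dataset even when one class is frequently misclassified.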

Keywords: Class activation mapping; Dermoscopic images; Swin transformer; Vitiligo.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Humans
  • Vitiligo* / diagnosis