Tailored for Real-World: A Whole Slide Image Classification System Validated on Uncurated Multi-Site Data Emulating the Prospective Pathology Workload

Sci Rep. 2020 Feb 21;10(1):3217. doi: 10.1038/s41598-020-59985-2.

Abstract

The standard-of-care diagnostic procedure for suspected skin cancer is microscopic examination of hematoxylin & eosin stained tissue by a pathologist. Areas of high inter-pathologist discordance and rising biopsy rates necessitate higher efficiency and diagnostic reproducibility. We present and validate a deep learning system which classifies digitized dermatopathology slides into 4 categories. The system was developed using 5,070 images from a single lab and tested on an uncurated set of 13,537 images from 3 test labs, digitized using whole slide scanners manufactured by 3 different vendors. When a deep-learning-based confidence score is used as the criterion for accepting a result as accurate, the system achieves an accuracy of up to 98%, which makes it adoptable in a real-world setting. Without confidence scoring, the system achieved an accuracy of 78%. We anticipate that our deep learning system will serve as a foundation enabling faster diagnosis of skin cancer, identification of cases for specialist review, and targeted diagnostic classifications.
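
The gap between the two accuracy figures reflects a general pattern: when only predictions above a confidence threshold are accepted, accuracy is measured on the accepted subset, and the remaining slides are left for manual review. The abstract does not say how the confidence score is computed or which threshold is used, so the following is only a minimal sketch of the generic idea. The function name `selective_accuracy`, the class labels 0 to 3, the threshold values, and all data are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def selective_accuracy(
    predicted: Sequence[int],
    actual: Sequence[int],
    confidence: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (accuracy on accepted predictions, fraction of predictions accepted).

    A prediction is "accepted" when its confidence is >= threshold.
    This is a generic illustration, not the method used in the paper.
    """
    total = len(predicted)
    if total == 0 or not (total == len(actual) == len(confidence)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Keep only the (prediction, label) pairs that pass the confidence criterion.
    accepted = [
        (p, a) for p, a, c in zip(predicted, actual, confidence) if c >= threshold
    ]
    coverage = len(accepted) / total
    if not accepted:
        # Nothing passed the threshold, so accuracy is undefined.
        return float("nan"), coverage

    correct = sum(1 for p, a in accepted if p == a)
    return correct / len(accepted), coverage


# Toy data (hypothetical): 8 slides, 4 classes labeled 0..3.
predicted = [0, 1, 2, 3, 1, 0, 2, 3]
actual = [0, 1, 2, 3, 0, 0, 1, 3]
confidence = [0.99, 0.95, 0.97, 0.92, 0.55, 0.91, 0.60, 0.98]

# Threshold 0.0 accepts every prediction (no confidence filtering).
acc_all, cov_all = selective_accuracy(predicted, actual, confidence, 0.0)
# Threshold 0.9 drops the two low-confidence predictions.
acc_conf, cov_conf = selective_accuracy(predicted, actual, confidence, 0.9)
print(f"no filtering:   accuracy={acc_all:.2f}, coverage={cov_all:.2f}")
print(f"threshold 0.90: accuracy={acc_conf:.2f}, coverage={cov_conf:.2f}")
```

In this toy example the two misclassified slides also carry the lowest confidence, so filtering raises accuracy while lowering coverage. Raising the threshold trades the share of slides handled automatically against accuracy on that share.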

MeSH terms

  • Algorithms
  • Calibration
  • Cell Proliferation
  • Computer Simulation
  • Deep Learning
  • Humans
  • Image Interpretation, Computer-Assisted / methods
  • Image Processing, Computer-Assisted / methods*
  • Melanocytes / cytology
  • Neural Networks, Computer
  • Pathology / methods*
  • Pattern Recognition, Automated*
  • Prospective Studies
  • ROC Curve
  • Reproducibility of Results
  • Skin Neoplasms / diagnostic imaging*
  • Workload