NDB-UFES: An oral cancer and leukoplakia dataset composed of histopathological images and patient data

Data Brief. 2023 Apr 7:48:109128. doi: 10.1016/j.dib.2023.109128. eCollection 2023 Jun.

Abstract

The gold standard for diagnosing oral cancer is the microscopic analysis of specimens removed, preferably by incisional biopsy, from oral mucosa with a clinically detected suspicious lesion. This dataset contains histopathological images of oral squamous cell carcinoma and leukoplakia. A total of 237 images were captured: 89 of leukoplakia with dysplasia, 57 of leukoplakia without dysplasia, and 91 of carcinoma. The images were captured with an optical light microscope fitted with a microscope camera, using 10x and 40x objectives, and visualized through software. They were saved in PNG format at 2048 × 1536 pixels and come from hematoxylin-eosin stained histopathologic slides of biopsies performed between 2010 and 2021 in patients managed at the Oral Diagnosis project (NDB) of the Federal University of Espírito Santo (UFES). Oral leukoplakias are represented by samples with and without epithelial dysplasia. Because the diagnosis considers socio-demographic data (gender, age and skin color) as well as clinical data (tobacco use, alcohol consumption, sun exposure, fundamental lesion, type of biopsy, lesion color, lesion surface and lesion diagnosis), this information was also collected. Our aim in releasing the NDB-UFES dataset is to provide researchers in Artificial Intelligence (machine and deep learning) with a new resource for developing tools that assist clinicians and pathologists in the automated diagnosis of oral potentially malignant disorders and oral squamous cell carcinoma.
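
The three classes are moderately imbalanced, which may matter when training a classifier on these images. As a minimal sketch, the class shares and a set of inverse-frequency class weights can be derived from the counts reported above. The label names and the `class_weights` helper are hypothetical and chosen only for illustration; they are not part of the dataset or of any released code.

```python
# Image counts per class, as reported in the abstract.
# The label strings are hypothetical names, not the dataset's own identifiers.
CLASS_COUNTS = {
    "leukoplakia_with_dysplasia": 89,
    "leukoplakia_without_dysplasia": 57,
    "carcinoma": 91,
}


def class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


total = sum(CLASS_COUNTS.values())
weights = class_weights(CLASS_COUNTS)

# Report each class's share of the dataset and its weight.
for label, n in CLASS_COUNTS.items():
    print(f"{label}: {n} images ({n / total:.1%}), weight {weights[label]:.3f}")
```

Weights of this kind could be passed to a weighted loss function so that the smallest class (leukoplakia without dysplasia) is not under-represented during training; whether weighting helps in practice would need to be checked empirically.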

Keywords: Dataset; Information sources; Mouth diseases; Oral leukoplakia; Squamous cell carcinoma.