A Neoteric Feature Extraction Technique to Predict the Survival of Gastric Cancer Patients

Diagnostics (Basel). 2024 May 1;14(9):954. doi: 10.3390/diagnostics14090954.

Abstract

Background: At the time of cancer diagnosis, it is crucial to accurately classify malignant gastric tumors and the possibility that patients will survive.

Objective: This study aims to investigate the feasibility of identifying and applying a new feature extraction technique to predict the survival of gastric cancer patients.

Methods: A retrospective dataset including the computed tomography (CT) images of 135 patients was assembled. Among them, 68 patients survived longer than three years. Several sets of radiomics features were extracted and were incorporated into a machine learning model, and their classification performance was characterized. To improve the classification performance, we further extracted another 27 texture and roughness parameters with 2484 superficial and spatial features to propose a new feature pool. This new feature set was added into the machine learning model and its performance was analyzed. To determine the best model for our experiment, Random Forest (RF) classifier, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naïve Bayes (NB) (four of the most popular machine learning models) were utilized. The models were trained and tested using the five-fold cross-validation method.

Results: Using the area under ROC curve (AUC) as an evaluation index, the model that was generated using the new feature pool yields AUC = 0.98 ± 0.01, which was significantly higher than the models created using the traditional radiomics feature set (p < 0.04). RF classifier performed better than the other machine learning models.

Conclusions: This study demonstrated that although radiomics features produced good classification performance, creating new feature sets significantly improved the model performance.

Keywords: K-Nearest Neighbors (KNN); Naïve Bayes (NB); area under ROC curve (AUC); feature extraction; gastric cancer; radiomics features; random forest (RF) classifier; support vector machine (SVM).