Prediction of TNFRSF9 expression and molecular pathological features in thyroid cancer using machine learning to construct Pathomics models

Endocrine. 2024 May 16. doi: 10.1007/s12020-024-03862-9. Online ahead of print.

Abstract

Background: The TNFRSF9 molecule is pivotal in thyroid carcinoma (THCA) development. This study utilizes Pathomics techniques to predict TNFRSF9 expression in THCA tissue and explore its molecular mechanisms.

Methods: Transcriptome data, pathology images, and clinical information from The Cancer Genome Atlas (TCGA) were analyzed. Image segmentation and feature extraction were performed using Otsu's algorithm and the pyradiomics package. The dataset was split for training and validation. Features were selected using maximum relevance minimum redundancy recursive feature elimination (mRMR_RFE), and modeling was conducted with the gradient boosting machine (GBM) algorithm. Model evaluation included receiver operating characteristic (ROC) curve analysis. The Pathomics model output a probabilistic pathomics score (PS) for gene expression prediction, and its prognostic value was assessed in TNFRSF9 expression groups. Subsequent analysis involved gene set variation analysis (GSVA), immune gene expression, cell abundance, immunotherapy susceptibility, and gene mutation analysis.
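Two of the steps named in the Methods, Otsu thresholding and ROC analysis, can be illustrated with a minimal standard-library sketch. This is not the study's pipeline: the function names `otsu_threshold` and `roc_auc`, the toy pixel values, and the toy scores are all hypothetical, and the pyradiomics, mRMR_RFE, and GBM steps are not reproduced here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy greyscale values (hypothetical): a dark cluster and a bright cluster.
toy_pixels = [20] * 50 + [30] * 50 + [200] * 60 + [220] * 40
t = otsu_threshold(toy_pixels)
dark_mask = [p <= t for p in toy_pixels]  # the darker of the two classes

# Toy labels and model scores (hypothetical), standing in for the PS.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)
print(t, sum(dark_mask), auc)
```

On a brightfield H&E image the background is usually the brighter class, so the darker class is often taken as tissue, although the abstract does not say how the threshold was applied.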

Results: High TNFRSF9 expression correlated with a worse progression-free interval (PFI) and was an independent risk factor [hazard ratio (HR) = 2.178, 95% confidence interval (CI) 1.045-4.538, P = 0.038]. Nine pathohistological features were identified. The GBM Pathomics model demonstrated good prediction efficacy [area under the curve (AUC) 0.819 and 0.769] and clinical benefits. High PS was a risk factor for PFI (HR = 2.156, 95% CI 1.047-4.440, P = 0.037). Patients with high PS potentially exhibited enriched pathways, increased TIGIT gene expression, greater Treg infiltration (P < 0.0001), and higher rates of gene mutations (BRAF, TTN, TG).
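The reported hazard ratios, confidence intervals, and P values can be checked for internal consistency. The sketch below assumes the intervals are Wald-type 95% CIs that are symmetric on the log scale, as is common for Cox-type models; the abstract does not state the model or the interval type, and the helper name `wald_p_from_hr_ci` is hypothetical.

```python
import math


def wald_p_from_hr_ci(hr, lo, hi, z_crit=1.959964):
    """Recover the log-scale SE, z statistic, and two-sided P value.

    Assumes [lo, hi] is a Wald 95% CI: exp(log(hr) +/- z_crit * se).
    """
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard-normal tail
    return se, z, p


# Values copied from the Results: (HR, CI lower, CI upper, reported P)
reported = {
    "High TNFRSF9 expression": (2.178, 1.045, 4.538, 0.038),
    "High PS": (2.156, 1.047, 4.440, 0.037),
}

for name, (hr, lo, hi, p_reported) in reported.items():
    se, z, p = wald_p_from_hr_ci(hr, lo, hi)
    # Under the assumption, the HR is the geometric mean of the CI bounds.
    midpoint = math.sqrt(lo * hi)
    print(f"{name}: se={se:.3f}, z={z:.3f}, p={p:.3f} "
          f"(reported {p_reported}), geometric midpoint={midpoint:.3f}")
```

Under this assumption, both recomputed P values agree with the reported ones to three decimal places, and each HR sits at the geometric midpoint of its interval.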

Conclusions: The GBM Pathomics model, constructed from the pathohistological features of H&E-stained sections, predicted the expression level of TNFRSF9 in THCA well.

Keywords: TNFRSF9; Machine learning; Pathomics; Thyroid carcinoma.