[Establishment of a prognostic model for non-nephrotic membranous nephropathy based on unbalanced data]

Zhonghua Yi Xue Za Zhi. 2023 May 16;103(18):1386-1392. doi: 10.3760/cma.j.cn112137-20221115-02399.
[Article in Chinese]

Abstract

Objective: To develop a machine learning model, trained on imbalanced data, that predicts the prognosis of non-nephrotic membranous nephropathy (membranous nephropathy with non-nephrotic-range proteinuria).
Methods: The clinical and pathological data of patients diagnosed by renal biopsy with non-nephrotic membranous nephropathy at Shanxi Provincial People's Hospital from January 2018 to December 2021 were retrospectively analyzed. Prediction models were built with three algorithms: logistic regression, support vector machine (SVM), and light gradient boosting machine (lightGBM). A hybrid (mixed) sampling technique was used to handle the class imbalance, and the area under the receiver operating characteristic curve (AUC) was used to evaluate the predictive performance of the models. Finally, Shapley additive explanations (SHAP) were used to interpret the results of the best-performing model.
Results: A total of 148 patients were included in the study (84 males and 64 females), with a mean age of (47.2±12.5) years. The follow-up time [M (Q1, Q3)] was 14 (7, 20) months. Twenty-three patients (15.5%) reached the renal endpoint during the study. The SVM model had the highest AUC (0.868, 95%CI: 0.813-0.925), followed by logistic regression (AUC=0.865, 95%CI: 0.755-0.899) and lightGBM (AUC=0.791, 95%CI: 0.690-0.882). Random forest (RF)-based recursive feature elimination with cross-validation (RFECV) and the SHAP plot of the SVM model showed that immunohistochemical IgG, serum total protein (TP), serum anti-phospholipase A2 receptor antibody (anti-PLA2R), blood chloride, and D-dimer were risk factors affecting the prognosis of non-nephrotic membranous nephropathy. Higher levels of immunohistochemical IgG, anti-PLA2R, and D-dimer were associated with a higher risk of reaching the renal endpoint.
Conclusion: The SVM model established in this study can effectively predict the prognosis of non-nephrotic membranous nephropathy and provides a new method for the early identification of high-risk patients and for precision therapy.
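
The resampling and evaluation steps named in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the abstract does not say which mixed sampling method was used, so plain random oversampling of the minority class combined with random undersampling of the majority class stands in for it here. The function names (`hybrid_resample`, `auc`, `bootstrap_auc_ci`), the single synthetic feature, and the target class size of 60 are all hypothetical; only the class counts (23 events among 148 patients) come from the abstract.

```python
import random


def hybrid_resample(X, y, target_per_class, seed=0):
    """Undersample classes larger than target_per_class and oversample
    (with replacement) classes smaller than it. Assumed stand-in for the
    unspecified 'mixed sampling' step."""
    rng = random.Random(seed)
    X_out, y_out = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        if len(idx) >= target_per_class:
            chosen = rng.sample(idx, target_per_class)  # drop majority cases
        else:
            extra = [rng.choice(idx) for _ in range(target_per_class - len(idx))]
            chosen = idx + extra  # keep every minority case, add duplicates
        X_out.extend(X[i] for i in chosen)
        y_out.extend(label for _ in chosen)
    return X_out, y_out


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples that contain only one class
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Synthetic data: class counts follow the abstract, the feature is made up.
rng = random.Random(42)
y = [1] * 23 + [0] * 125
X = [[rng.gauss(1.0 if t else 0.0, 1.0)] for t in y]

# Balance the classes; in practice this is applied to training folds only.
X_bal, y_bal = hybrid_resample(X, y, target_per_class=60)

# Score the untouched, imbalanced data with the raw feature as a toy "model".
scores = [row[0] for row in X]
point = auc(y, scores)
low, high = bootstrap_auc_ci(y, scores)
print(f"AUC={point:.3f}, 95%CI: {low:.3f}-{high:.3f}")
```

Resampling is kept separate from scoring on purpose: balancing the data on which the AUC is computed would distort the event rate the model faces in practice, so the sketch evaluates on the original 23/125 split.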


Publication types

  • English Abstract

MeSH terms

  • Adult
  • Female
  • Glomerulonephritis, Membranous* / drug therapy
  • Humans
  • Immunoglobulin G / therapeutic use
  • Kidney / pathology
  • Male
  • Middle Aged
  • Prognosis
  • Retrospective Studies

Substances

  • Immunoglobulin G