Prediction of Genotype Positivity in Patients With Hypertrophic Cardiomyopathy Using Machine Learning

Lusha W Liang; Michael A Fifer; Kohei Hasegawa; Mathew S Maurer; Muredach P Reilly; Yuichi J Shimada

doi:10.1161/CIRCGEN.120.003259

Circ Genom Precis Med. 2021 Jun;14(3):e003259. doi: 10.1161/CIRCGEN.120.003259. Epub 2021 Apr 23.

Authors

Lusha W Liang¹, Michael A Fifer², Kohei Hasegawa³, Mathew S Maurer¹, Muredach P Reilly¹,⁴, Yuichi J Shimada¹

Affiliations

¹ Division of Cardiology, Department of Medicine (L.W.L., M.S.M., M.P.R., Y.J.S.), Columbia University Irving Medical Center, New York, NY.
² Cardiology Division, Department of Medicine (M.A.F.), Massachusetts General Hospital, Boston.
³ Department of Emergency Medicine (K.H.), Massachusetts General Hospital, Boston.
⁴ Irving Institute for Clinical and Translational Research (M.P.R.), Columbia University Irving Medical Center, New York, NY.

Abstract

Background: Genetic testing can determine family screening strategies and has prognostic and diagnostic value in hypertrophic cardiomyopathy (HCM). However, it can also pose a significant psychosocial burden. Conventional scoring systems offer modest ability to predict genotype positivity. The aim of our study was to develop a novel prediction model for genotype positivity in patients with HCM by applying machine learning (ML) algorithms.

Methods: We constructed 3 ML models using readily available clinical and cardiac imaging data of 102 patients from Columbia University with HCM who had undergone genetic testing (the training set). We validated model performance on 76 patients with HCM from Massachusetts General Hospital (the test set). Within the test set, we compared the areas under the receiver operating characteristic curve (AUROCs) for the ML models against the AUROCs generated by the Toronto HCM Genotype Score (the Toronto score) and Mayo HCM Genotype Predictor (the Mayo score) using the DeLong test and net reclassification improvement.
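As an illustration of the two comparison metrics named in the Methods, the following is a minimal sketch in plain Python, not the authors' code. It computes the AUROC as the probability that a randomly chosen genotype-positive patient receives a higher score than a randomly chosen genotype-negative patient, and a two-category net reclassification improvement (NRI). The abstract does not state which NRI variant was used, so the categorical form and the 0.5 threshold are assumptions, as are the function names and the toy data. The DeLong test is not implemented here.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def categorical_nri(labels: Sequence[int],
                    old_class: Sequence[int],
                    new_class: Sequence[int]) -> float:
    """Two-category NRI: net correct upward moves among events
    plus net correct downward moves among non-events."""
    up_e = down_e = up_n = down_n = 0
    n_events = sum(1 for y in labels if y == 1)
    n_nonevents = len(labels) - n_events
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    for y, old, new in zip(labels, old_class, new_class):
        if new > old:          # reclassified to the higher-risk category
            if y == 1:
                up_e += 1
            else:
                up_n += 1
        elif new < old:        # reclassified to the lower-risk category
            if y == 1:
                down_e += 1
            else:
                down_n += 1
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Hypothetical toy data (NOT from the study): 1 = genotype positive.
labels = [1, 1, 1, 0, 0, 0, 0]
new_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]   # stand-in for an ML model
old_scores = [0.6, 0.3, 0.4, 0.5, 0.7, 0.2, 0.1]   # stand-in for a conventional score

THRESHOLD = 0.5  # assumed cut-off, for illustration only
new_class = [int(s >= THRESHOLD) for s in new_scores]
old_class = [int(s >= THRESHOLD) for s in old_scores]

auc_new = auroc(labels, new_scores)
auc_old = auroc(labels, old_scores)
nri = categorical_nri(labels, old_class, new_class)
print(f"AUROC new={auc_new:.3f} old={auc_old:.3f} NRI={nri:.3f}")
```

A positive NRI means the new score moves more patients into the correct category than the wrong one. Whether an AUROC difference is statistically significant requires a paired test such as DeLong's, which accounts for both scores being evaluated on the same patients.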

Results: Overall, 63 of the 178 patients (35%) were genotype positive. The random forest ML model developed in the training set demonstrated an AUROC of 0.92 (95% CI, 0.85-0.99) in predicting genotype positivity in the test set, significantly outperforming the Toronto score (AUROC, 0.77 [95% CI, 0.65-0.90], P=0.004, net reclassification improvement: P<0.001) and the Mayo score (AUROC, 0.79 [95% CI, 0.67-0.92], P=0.01, net reclassification improvement: P=0.001). The gradient boosted decision tree ML model also achieved significant net reclassification improvement over the Toronto score (P<0.001) and the Mayo score (P=0.03), with an AUROC of 0.87 (95% CI, 0.75-0.99). Compared with the Toronto and Mayo scores, all 3 ML models had higher sensitivity, positive predictive value, and negative predictive value.
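For readers less familiar with the threshold-based metrics reported in the Results, here is a short sketch of how sensitivity, positive predictive value, and negative predictive value follow from a 2x2 confusion matrix, along with the prevalence arithmetic stated above (63 of 178). The confusion-matrix counts are hypothetical and chosen only for illustration, since the abstract does not report them. The function name is likewise an assumption.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard 2x2 diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all genotype positives
        "specificity": tn / (tn + fp),  # true negatives among all genotype negatives
        "ppv": tp / (tp + fp),          # genotype positives among predicted positives
        "npv": tn / (tn + fn),          # genotype negatives among predicted negatives
    }


# Prevalence reported in the abstract: 63 genotype-positive of 178 patients.
prevalence = 63 / 178
print(f"prevalence = {prevalence:.1%}")

# Hypothetical counts (NOT from the study), for illustration only.
metrics = diagnostic_metrics(tp=20, fp=5, tn=45, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Unlike the AUROC, positive and negative predictive values depend on the prevalence of genotype positivity in the cohort, so they may not transfer directly to populations with a different prevalence.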

Conclusions: Our ML models demonstrated a superior ability to predict genotype positivity in patients with HCM compared with conventional scoring systems in an external validation test set.

Keywords: cardiomyopathies; genes; genotype; machine learning; mutation.

Publication types

Clinical Trial
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Cardiomyopathy, Hypertrophic / genetics*
Female
Genetic Testing*
Genotype*
Humans
Machine Learning*
Male
Middle Aged
Models, Genetic*
Predictive Value of Tests
Risk Assessment
Risk Factors
