Novel Machine Learning Identifies Five Asthma Phenotypes Using Cluster Analysis of Real-World Data

J Allergy Clin Immunol Pract. 2024 Apr 27:S2213-2198(24)00420-3. doi: 10.1016/j.jaip.2024.04.035. Online ahead of print.

Abstract

Background: Asthma classification into different sub-phenotypes is important to guide personalized therapy and improve outcomes.

Objectives: This study sought to further characterize asthma heterogeneity by identifying distinct patient groups using novel machine learning (ML) approaches and large-scale real-world data.

Methods: We used electronic health records of patients with asthma followed at the Cleveland Clinic between 2010 and 2021. We employed k-prototype unsupervised ML to develop a clustering model in which the input variables were age, gender, race, body mass index (BMI), pre- and post-bronchodilator (BD) spirometry measurements, and the use of inhaled/systemic steroids. We used elbow and silhouette plots to select the optimal number of clusters. The resulting clusters were then evaluated with a supervised LightGBM model, using its cross-validated F1 score to support their distinctiveness.
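
As an illustration of the clustering step, the following is a minimal sketch of the standard k-prototypes idea: the distance between a record and a cluster prototype is the squared Euclidean distance on the numeric features plus a weight (gamma) times the number of mismatched categorical features, and the total within-cluster cost is what an elbow plot tracks across candidate cluster counts. It is not the study's code. The abstract does not state which implementation, gamma value, or initialization was used, so the function names, the evenly spaced initialization, gamma = 0.5, and the six toy records below are all assumptions made for illustration only.

```python
from collections import Counter

def dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """Mixed-type distance: squared Euclidean on numeric features
    plus gamma times the count of mismatched categorical features."""
    numeric = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric + gamma * categorical

def k_prototypes(num, cat, k, gamma=0.5, n_iter=20):
    """Toy k-prototypes. Returns (labels, prototypes, total cost)."""
    n = len(num)
    # Assumption: deterministic init from evenly spaced records,
    # chosen here only so the example is reproducible.
    protos = [(list(num[i * n // k]), list(cat[i * n // k])) for i in range(k)]
    labels = [None] * n
    for _ in range(n_iter):
        # Assignment step: each record goes to its nearest prototype.
        new_labels = [
            min(range(k), key=lambda c: dissimilarity(
                num[i], cat[i], protos[c][0], protos[c][1], gamma))
            for i in range(n)
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: numeric means and categorical modes per cluster.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue  # keep the old prototype for an empty cluster
            means = [sum(num[i][j] for i in members) / len(members)
                     for j in range(len(num[0]))]
            modes = [Counter(cat[i][j] for i in members).most_common(1)[0][0]
                     for j in range(len(cat[0]))]
            protos[c] = (means, modes)
    cost = sum(dissimilarity(num[i], cat[i],
                             protos[labels[i]][0], protos[labels[i]][1], gamma)
               for i in range(n))
    return labels, protos, cost

# Hypothetical, already-standardized toy records (not study data):
# numeric = (age z-score, BMI z-score); categorical = (sex, steroid type).
toy_num = [(-1.0, -0.9), (-1.1, -1.0), (-0.9, -1.1),
           (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
toy_cat = [("F", "inhaled"), ("F", "inhaled"), ("M", "inhaled"),
           ("M", "systemic"), ("M", "systemic"), ("F", "systemic")]

# Elbow-style scan: total cost for each candidate number of clusters.
elbow = {k: k_prototypes(toy_num, toy_cat, k)[2] for k in (1, 2, 3)}
labels, protos, cost = k_prototypes(toy_num, toy_cat, 2)
print(labels, elbow)
```

In practice the numeric features would be standardized first so that no single measurement dominates the distance, and the silhouette and supervised F1 checks described above would be run on the resulting labels; those steps are omitted from this sketch.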

Results: Data from 13,498 patients with asthma who had available post-BD spirometry measurements were extracted, and 5 stable clusters were identified. Cluster 1 included a young non-severe asthma population with normal lung function and a higher frequency of acute exacerbations (0.8/patient-year). Cluster 2 had the highest BMI (mean [SD]: 44.44 [7.83] kg/m²) and the highest proportions of female (77.5%) and African American (28.9%) patients. Cluster 3 comprised patients with normal lung function. Cluster 4 included patients with a lower FEV1% of 77.03 (12.79) and a poor response to bronchodilators. Cluster 5 had the lowest FEV1% of 68.08 (15.02), the highest post-BD reversibility, and the highest proportions of severe asthma (44.9%) and blood eosinophilia (>300 cells/μL) (34.8%).

Conclusion: Using real-world data and unsupervised ML, we classified asthma into 5 clinically important sub-phenotypes for which group-specific asthma treatment and management strategies can be designed and deployed.

Keywords: asthma; asthma phenotype; cluster analysis; machine learning.