A Multi-Ancestry Polygenic Risk Score for Coronary Heart Disease Based on an Ancestrally Diverse Genome-Wide Association Study and Population-Specific Optimization

medRxiv [Preprint]. 2023 Jun 6:2023.06.02.23290896. doi: 10.1101/2023.06.02.23290896.

Abstract

Background: Predictive performance of polygenic risk scores (PRS) varies across populations. To facilitate equitable clinical use, we developed PRS for coronary heart disease (PRSCHD) for 5 genetic ancestry groups.

Methods: We derived ancestry-specific and multi-ancestry PRSCHD based on pruning and thresholding (PRSP+T) and continuous shrinkage priors (PRSCSx) applied to summary statistics from the largest multi-ancestry genome-wide meta-analysis for CHD to date, including 1.1 million participants from 5 continental populations. Following training and optimization of PRSCHD in the Million Veteran Program, we evaluated the predictive performance of the best-performing PRSCHD in 176,988 individuals across 9 cohorts of diverse genetic ancestry.
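To make the pruning-and-thresholding idea concrete, the following is a minimal sketch on toy data, not the study's pipeline. The variant names, effect sizes, p-values, dosages, and the two tuning values (`p_max`, `r2_max`) are all hypothetical, and the greedy pruning rule is one simple assumed variant of the technique: keep variants below a p-value threshold, most significant first, and drop any variant too strongly correlated with one already kept. The score is the effect-weighted sum of allele dosages, standardized so that effects can be expressed per 1 SD.

```python
from statistics import mean, pstdev


def r_squared(x, y):
    """Squared Pearson correlation between two dosage vectors (LD proxy)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov * cov / (vx * vy)


def prune_and_threshold(variants, dosages, p_max, r2_max):
    """Greedy P+T: threshold on p-value, then prune correlated variants."""
    candidates = sorted(
        (v for v in variants if v["p"] <= p_max), key=lambda v: v["p"]
    )
    kept = []
    for v in candidates:
        if all(
            r_squared(dosages[v["id"]], dosages[k["id"]]) <= r2_max
            for k in kept
        ):
            kept.append(v)
    return kept


def standardized_score(kept, dosages, n_individuals):
    """Effect-weighted dosage sum per individual, scaled to mean 0, SD 1."""
    raw = [
        sum(v["beta"] * dosages[v["id"]][i] for v in kept)
        for i in range(n_individuals)
    ]
    mu, sd = mean(raw), pstdev(raw)
    return [(s - mu) / sd for s in raw]


# Hypothetical summary statistics (beta = log odds ratio per allele).
variants = [
    {"id": "var_a", "beta": 0.20, "p": 1e-8},
    {"id": "var_b", "beta": 0.18, "p": 1e-6},   # in perfect LD with var_a
    {"id": "var_c", "beta": -0.10, "p": 1e-4},  # independent of var_a
    {"id": "var_d", "beta": 0.30, "p": 0.2},    # fails the p-value threshold
]
# Hypothetical allele dosages (0, 1, or 2) for six individuals.
dosages = {
    "var_a": [0, 1, 2, 1, 0, 2],
    "var_b": [0, 1, 2, 1, 0, 2],
    "var_c": [1, 0, 1, 2, 1, 1],
    "var_d": [1, 1, 0, 2, 1, 0],
}

kept = prune_and_threshold(variants, dosages, p_max=0.05, r2_max=0.1)
kept_ids = [v["id"] for v in kept]
z_scores = standardized_score(kept, dosages, n_individuals=6)
```

In this toy setup, tuning amounts to repeating the procedure over a grid of `p_max` and `r2_max` values and keeping the combination that predicts best in a training cohort.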

Results: Multi-ancestry PRSP+T outperformed ancestry-specific PRSP+T across a range of tuning values. In the training stage, for all ancestry groups, PRSCSx performed better than PRSP+T, and multi-ancestry PRS outperformed ancestry-specific PRS. In independent validation cohorts, the selected multi-ancestry PRSP+T demonstrated the strongest association with CHD in individuals of South Asian (SAS) and European (EUR) ancestry (OR per 1 SD [95% CI]: 2.75[2.41-3.14] and 1.65[1.59-1.72], respectively), followed by East Asian (EAS) (1.56[1.50-1.61]) and Hispanic/Latino (HIS) (1.38[1.24-1.54]) ancestry, and the weakest in African (AFR) ancestry (1.16[1.11-1.21]). The selected multi-ancestry PRSCSx showed a stronger association with CHD in within-ancestry comparisons; the association was strongest in SAS (2.67[2.38-3.00]) and EUR (1.65[1.59-1.71]), progressively decreasing in EAS (1.59[1.54-1.64]) and HIS (1.51[1.35-1.69]), and lowest in AFR (1.20[1.15-1.26]).
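When comparing odds ratios reported as "OR per 1 SD [95% CI]", it can help to move to the log scale. The sketch below recovers a log odds ratio and an approximate standard error from a reported interval. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the abstract does not state, and the function name is hypothetical. The inputs are the AFR estimates quoted above.

```python
import math
from statistics import NormalDist


def log_or_and_se(odds_ratio, ci_lower, ci_upper, level=0.95):
    """Return (log OR, approximate SE), assuming a Wald interval on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return beta, se


# AFR ancestry, values as reported in the Results.
beta_pt, se_pt = log_or_and_se(1.16, 1.11, 1.21)    # multi-ancestry PRSP+T
beta_csx, se_csx = log_or_and_se(1.20, 1.15, 1.26)  # multi-ancestry PRSCSx

print(f"PRSP+T: log OR = {beta_pt:.3f}, SE = {se_pt:.3f}")
print(f"PRSCSx: log OR = {beta_csx:.3f}, SE = {se_csx:.3f}")
```

Because both scores were evaluated in the same validation individuals, the two estimates are correlated, so these standard errors are not sufficient on their own for a formal test of the difference.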

Conclusions: Utilizing diverse summary statistics from a large multi-ancestry genome-wide meta-analysis led to improved performance of PRSCHD in most ancestry groups compared to single-ancestry methods. Improvement in predictive performance was limited, specifically in AFR and HIS, despite the use of one of the largest and most diverse sets of training and validation cohorts to date. This highlights the need for larger GWAS datasets of AFR and HIS individuals to enhance the performance of PRSCHD.

Publication types

  • Preprint