Identifying cross-disease components of genetic risk across hospital data in the UK Biobank

Adrian Cortes; Patrick K Albers; Calliope A Dendrou; Lars Fugger; Gil McVean

doi:10.1038/s41588-019-0550-4

Nat Genet. 2020 Jan;52(1):126-134. doi: 10.1038/s41588-019-0550-4. Epub 2019 Dec 23.

Authors

Adrian Cortes^#¹,², Patrick K Albers^#¹, Calliope A Dendrou³, Lars Fugger²,⁴,⁵, Gil McVean⁶

Affiliations

¹ Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
² Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, Division of Clinical Neurology, John Radcliffe Hospital, University of Oxford, Oxford, UK.
³ Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK.
⁴ MRC Human Immunology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, University of Oxford, Oxford, UK.
⁵ Danish National Research Foundation Centre PERSIMUNE, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark.
⁶ Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK. gil.mcvean@bdi.ox.ac.uk.

^# Contributed equally.

Abstract

Genetic risk factors frequently affect multiple common human diseases, providing insight into shared pathophysiological pathways and opportunities for therapeutic development. However, systematic identification of genetic profiles of disease risk is limited both by the availability of comprehensive clinical data on population-scale cohorts and by the lack of suitable statistical methodology that can handle the scale of, and differential power inherent in, multi-phenotype data. Here, we develop a disease-agnostic approach to cluster the genetic risk profiles for 3,025 genome-wide independent loci across 19,155 disease classification codes from 320,644 participants in the UK Biobank, representing a large and heterogeneous population. We identify 339 distinct disease association profiles and use multiple approaches to link clusters to the underlying biological pathways. We show how clusters can decompose the variance and covariance in risk for disease, thereby identifying underlying biological processes and their impact. We demonstrate the use of clusters in defining disease relationships and their potential in informing therapeutic strategies.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Adult
Aged
Biological Specimen Banks*
Female
Gene-Environment Interaction
Genetic Diseases, Inborn / genetics*
Genetic Loci*
Genetic Predisposition to Disease*
Genome-Wide Association Study*
Humans
Male
Middle Aged
Phenotype
Polymorphism, Single Nucleotide*
Prospective Studies
Quantitative Trait, Heritable*
Risk Factors
United Kingdom
