Leveraging health systems data to characterize a large effect variant conferring risk for liver disease in Puerto Ricans

Am J Hum Genet. 2021 Nov 4;108(11):2099-2111. doi: 10.1016/j.ajhg.2021.09.016. Epub 2021 Oct 21.

Abstract

The integration of genomic data into health systems offers opportunities to identify genomic factors underlying the continuum of rare and common disease. We applied a population-scale haplotype association approach based on identity-by-descent (IBD) in a large multi-ethnic biobank to a spectrum of disease outcomes derived from electronic health records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed that the variant disrupts splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting that the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed a carrier rate of 1.95% for the variant in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect. This work demonstrates that integrating EHR and genomic data at a population scale can facilitate strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine.
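
As a rough illustration of what the reported carrier rate implies, the sketch below converts a 1.95% carrier rate into an allele frequency and an expected homozygote frequency. It rests on two assumptions that the abstract does not state: that "carrier rate" means the heterozygote frequency, and that genotypes follow Hardy-Weinberg proportions, which a founder population may not satisfy. The function name and constant are hypothetical, and this is not the authors' analysis.

```python
import math


def allele_freq_from_carrier_rate(carrier_rate: float) -> float:
    """Return the allele frequency q that solves 2q(1 - q) = carrier_rate.

    Assumes Hardy-Weinberg proportions and takes the smaller root,
    which is the relevant one for a rare variant.
    """
    if not 0.0 <= carrier_rate <= 0.5:
        raise ValueError("carrier rate must lie in [0, 0.5] under Hardy-Weinberg")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_rate)) / 2.0


# 1.95% carrier rate, as reported in the abstract (assumed to be heterozygotes).
CARRIER_RATE = 0.0195

q = allele_freq_from_carrier_rate(CARRIER_RATE)
expected_homozygote_freq = q ** 2  # expected fraction of homozygotes under HWE

print(f"allele frequency q: {q:.5f}")
print(f"expected homozygote frequency: {expected_homozygote_freq:.2e}")
```

Under these assumptions the allele frequency is a little under 1%, so homozygotes are expected to be on the order of 1 in 10,000 individuals. Any departure from random mating would shift that figure.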

Keywords: electronic health records; identity-by-descent; liver disease; liver serum measures; phenome-wide association studies; population genetics; statistical genetics.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • ATP Binding Cassette Transporter, Subfamily B / genetics
  • Delivery of Health Care / organization & administration*
  • Electronic Health Records
  • Genetic Predisposition to Disease*
  • Haplotypes
  • Heterozygote
  • Hispanic or Latino / genetics
  • Homozygote
  • Humans
  • Liver Diseases / genetics*
  • Puerto Rico

Substances

  • ATP Binding Cassette Transporter, Subfamily B
  • multidrug resistance protein 3