A genome-wide spectrum of tandem repeat expansions in 338,963 humans

Cell. 2024 Apr 25;187(9):2336-2341.e5. doi: 10.1016/j.cell.2024.03.004. Epub 2024 Apr 5.

Abstract

The Genome Aggregation Database (gnomAD), widely recognized as the gold-standard reference map of human genetic variation, has largely overlooked tandem repeat (TR) expansions, even though TRs constitute ∼6% of our genome and are linked to over 50 human diseases. Here, we introduce TR-gnomAD (https://wlcb.oit.uci.edu/TRgnomAD), a biobank-scale reference of 0.86 million TRs derived from 338,963 whole-genome sequencing (WGS) samples of diverse ancestries (39.5% non-European samples). TR-gnomAD offers critical insights into ancestry-specific disease prevalence through disparities in TR unit number frequencies among ancestries. Moreover, TR-gnomAD can distinguish common, presumably benign TR expansions, which are prevalent in TR-gnomAD, from potentially pathogenic TR expansions, which are found more frequently in disease groups than in TR-gnomAD. Together, these features make TR-gnomAD an invaluable resource for researchers and physicians interpreting TR expansions in individuals with genetic diseases.
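The comparison described above (expansions that are common in the reference versus those enriched in a disease group) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the repeat-unit threshold, the toy allele counts, and the choice of a one-sided Fisher exact test are all assumptions made for illustration only.

```python
from math import comb


def expansion_fraction(unit_counts, threshold):
    """Fraction of alleles whose repeat unit count is at or above threshold."""
    if not unit_counts:
        raise ValueError("unit_counts must not be empty")
    expanded = sum(1 for n in unit_counts if n >= threshold)
    return expanded / len(unit_counts)


def fisher_one_sided(case_exp, case_total, ref_exp, ref_total):
    """One-sided Fisher exact p-value.

    Probability of observing at least case_exp expanded alleles among the
    case alleles if expansions were equally likely in cases and reference
    (hypergeometric upper tail).
    """
    total = case_total + ref_total
    exp_total = case_exp + ref_exp
    denom = comb(total, case_total)
    upper = min(case_total, exp_total)
    tail = 0
    for k in range(case_exp, upper + 1):
        tail += comb(exp_total, k) * comb(total - exp_total, case_total - k)
    return tail / denom


# Hypothetical toy data: repeat unit counts per allele at a single TR locus.
THRESHOLD = 30                         # assumed cutoff for an "expansion"
reference = [10] * 98 + [40] * 2       # stand-in for a reference panel
cases = [10] * 5 + [40] * 5            # stand-in for a disease cohort

ref_frac = expansion_fraction(reference, THRESHOLD)
case_frac = expansion_fraction(cases, THRESHOLD)
p_value = fisher_one_sided(
    case_exp=sum(n >= THRESHOLD for n in cases),
    case_total=len(cases),
    ref_exp=sum(n >= THRESHOLD for n in reference),
    ref_total=len(reference),
)
print(f"reference: {ref_frac:.3f}  cases: {case_frac:.3f}  p = {p_value:.2e}")
```

In this toy setting, an expansion that is rare in the reference but frequent among cases yields a small p-value, whereas an expansion that is equally common in both would not. A real analysis would also need to account for ancestry, genotyping error, and multiple testing.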

Keywords: GWAS; TR-gnomAD; ancestries; expansion; genome aggregation; human genetics; missing heritability; rare diseases; tandem repeat; whole genome sequencing.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Repeat Expansion / genetics
  • Databases, Genetic
  • Genome, Human*
  • Genome-Wide Association Study
  • Humans
  • Tandem Repeat Sequences* / genetics
  • Whole Genome Sequencing