Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits

Nat Genet. 2021 Jun;53(6):779-786. doi: 10.1038/s41588-021-00865-4. Epub 2021 May 10.

Abstract

Long-read sequencing (LRS) promises to improve the characterization of structural variants (SVs). We generated LRS data from 3,622 Icelanders and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions). We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association between a rare deletion in PCSK9 and low-density lipoprotein (LDL) cholesterol levels that were lower than the population average. We also discovered an association between a multiallelic SV in ACAN and height: we found 11 alleles that differed in the number of repeats of a 57-bp motif and observed a linear relationship between the number of repeats carried and height. These results show that SVs can be accurately characterized at the population scale using LRS data in a genome-wide, non-targeted approach and demonstrate how SVs impact phenotypes.
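
The linear relationship reported for the ACAN repeat can be illustrated with an ordinary least-squares fit of a trait on repeat count. The following is only a minimal sketch under stated assumptions: the helper `fit_line`, the variable names, and all numbers are hypothetical toy values chosen for illustration, not data or methods from the study, and it assumes each individual is summarized by the total number of repeats carried across both alleles.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical toy data, NOT from the study:
# total repeats carried across both alleles, per individual.
repeat_sum = [50, 52, 54, 56, 58]
# Hypothetical standardized trait values for the same individuals.
trait = [-0.2, -0.1, 0.0, 0.1, 0.2]

intercept, slope = fit_line(repeat_sum, trait)
print(f"estimated change in trait per additional repeat: {slope:.3f}")
```

Under this additive coding, the slope is the estimated change in the trait per additional repeat carried. A real analysis would also need covariates and an account of relatedness, which this sketch omits.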

MeSH terms

  • Alleles
  • Cholesterol, LDL / metabolism
  • Chromosomes, Human / genetics
  • Disease / genetics*
  • Female
  • Gene Frequency / genetics
  • Genomic Structural Variation*
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Iceland
  • Linear Models
  • Male
  • Proprotein Convertase 9 / genetics
  • Quantitative Trait, Heritable*
  • Recombination, Genetic / genetics
  • Sequence Deletion / genetics

Substances

  • Cholesterol, LDL
  • Proprotein Convertase 9