The human noncoding genome defined by genetic diversity

Nat Genet. 2018 Mar;50(3):333-337. doi: 10.1038/s41588-018-0062-7. Epub 2018 Feb 26.

Abstract

Understanding the significance of genetic variants in the noncoding genome is emerging as the next challenge in human genomics. We used the power of 11,257 whole-genome sequences and 16,384 heptamers (7-nt motifs) to build a map of sequence constraint for the human species. This build differed substantially from traditional maps of interspecies conservation and identified regulatory elements among the most constrained regions of the genome. Using new Hi-C experimental data, we describe a strong pattern of coordination over 2 Mb where the most constrained regulatory elements associate with the most essential genes. Constrained regions of the noncoding genome are up to 52-fold enriched for known pathogenic variants as compared to unconstrained regions (21-fold when compared to the genome average). This map of sequence constraint across thousands of individuals is an asset to help interpret noncoding elements in the human genome, prioritize variants and reconsider gene units at a larger scale.
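Two figures in the abstract can be illustrated with a short sketch: the 16,384 heptamers are simply all 4^7 sequences of length 7 over A, C, G and T, and a fold enrichment is a ratio of variant densities between two sets of regions. The code below is an illustration under stated assumptions, not the authors' pipeline. In particular, scoring each heptamer by how often its central base varies is an assumption about how such a map might be built, and the names `all_kmers`, `central_variation_rate` and `fold_enrichment` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
K = 7  # heptamer length, as stated in the abstract


def all_kmers(k=K):
    """Enumerate every k-mer over A, C, G, T (4**k sequences)."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def central_variation_rate(sequence, variant_positions, k=K):
    """For each k-mer seen in `sequence`, return the fraction of its
    occurrences whose central base is listed in `variant_positions`.

    Assumption (not from the abstract): variation is attributed to the
    middle nucleotide of the k-mer. Positions are 0-based.
    """
    half = k // 2
    variant_positions = set(variant_positions)
    seen = Counter()
    varied = Counter()
    for center in range(half, len(sequence) - half):
        kmer = sequence[center - half:center + half + 1]
        if not set(kmer) <= set(BASES):
            continue  # skip windows containing N or other ambiguity codes
        seen[kmer] += 1
        if center in variant_positions:
            varied[kmer] += 1
    return {kmer: varied[kmer] / seen[kmer] for kmer in seen}


def fold_enrichment(hits_in_region, region_size, hits_in_background, background_size):
    """Ratio of variant density in a region set to density in a background set."""
    return (hits_in_region / region_size) / (hits_in_background / background_size)


if __name__ == "__main__":
    print(len(all_kmers()))  # number of distinct heptamers
    rates = central_variation_rate("ACGTACGTACG", {3})
    print(rates["ACGTACG"])
    print(fold_enrichment(6, 2, 3, 4))
```

In this toy input the heptamer ACGTACG occurs twice and its central base is variant in one of the two occurrences. Real inputs would be reference sequence plus variant calls from many genomes, and the comparison of observed to expected variation per region is left out here.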

Publication types

  • Letter
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping / methods
  • Computational Biology
  • Conserved Sequence
  • Evolution, Molecular
  • Female
  • Genetic Variation*
  • Genome, Human*
  • Humans
  • Male
  • RNA, Untranslated / genetics*
  • Regulatory Sequences, Nucleic Acid

Substances

  • RNA, Untranslated