A unified encyclopedia of human functional DNA elements through fully automated annotation of 164 human cell types

Genome Biol. 2019 Aug 28;20(1):180. doi: 10.1186/s13059-019-1784-2.

Abstract

Semi-automated genome annotation methods such as Segway take as input a set of genome-wide measurements, such as histone modification or DNA accessibility data, and output an annotation of genomic activity in the target cell type. Here we present annotations of 164 human cell types using 1615 data sets. To produce these annotations, we automated the label interpretation step, yielding a fully automated annotation strategy. Using these annotations, we developed a measure of the importance of each genomic position called the "conservation-associated activity score." We further combined all annotations into a single, cell type-agnostic encyclopedia that catalogs all human regulatory elements.

Keywords: Chromatin; Genomics; Machine learning.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Automation
  • Cell Line
  • DNA / genetics*
  • Databases, Genetic*
  • Humans
  • Machine Learning
  • Molecular Sequence Annotation*
  • Phenotype
  • Transcription, Genetic

Substances

  • DNA