Annotations capturing cell type-specific TF binding explain a large fraction of disease heritability

Hum Mol Genet. 2020 May 8;29(7):1057-1067. doi: 10.1093/hmg/ddz226.

Abstract

Regulatory variation plays a major role in complex disease, and cell type-specific binding of transcription factors (TFs) is critical to gene regulation. However, assessing the contribution of genetic variation in TF-binding sites to disease heritability is challenging, as binding is often cell type-specific and annotations from directly measured TF binding are not currently available for most cell type-TF pairs. We investigate approaches to annotate TF binding, including directly measured chromatin data and sequence-based predictions. We find that TF-binding annotations constructed by intersecting sequence-based TF-binding predictions with cell type-specific chromatin data explain a large fraction of heritability across a broad set of diseases and corresponding cell types; this strategy of constructing annotations addresses both the limitation that identical sequences may be bound or unbound depending on surrounding chromatin context and the limitation that sequence-based predictions are generally not cell type-specific. We partitioned the heritability of 49 diseases and complex traits using stratified linkage disequilibrium (LD) score regression with the baseline-LD model (which is not cell type-specific) plus the new annotations. We determined that 100 bp windows around MotifMap sequence-based TF-binding predictions intersected with a union of six cell type-specific chromatin marks (imputed using ChromImpute) performed best, with a 58% increase in heritability enrichment compared to the chromatin marks alone (11.6× vs. 7.3×, P = 9 × 10⁻¹⁴ for difference) and a 20% increase in cell type-specific signal conditional on annotations from the baseline-LD model (P = 8 × 10⁻¹¹ for difference). Our results show that TF-binding annotations explain substantial disease heritability and can help refine genome-wide association signals.
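The annotation strategy described above (windows around sequence-based TF-binding predictions, intersected with a union of cell type-specific chromatin marks) can be illustrated with a minimal interval-arithmetic sketch. This is not the authors' pipeline: the function names, the coordinates, the mark names, and the `flank` parameter are hypothetical, and the abstract does not say whether "100 bp" is the total window width or the extension on each side, so the default below (50 bp each side of the site midpoint) is an assumption.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), 0-based and half-open, as in BED files
Interval = Tuple[str, int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Union of intervals: sort, then merge overlapping or touching ones."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def windows(sites: List[Interval], flank: int = 50) -> List[Interval]:
    """Window of 2 * flank bp centred on each predicted site's midpoint."""
    out: List[Interval] = []
    for chrom, start, end in sites:
        mid = (start + end) // 2
        out.append((chrom, max(0, mid - flank), mid + flank))
    return out


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Segments covered by both interval sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    out: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        ca, sa, ea = a[i]
        cb, sb, eb = b[j]
        if ca != cb:
            # Advance whichever list is on the earlier chromosome.
            if ca < cb:
                i += 1
            else:
                j += 1
            continue
        lo, hi = max(sa, sb), min(ea, eb)
        if lo < hi:
            out.append((ca, lo, hi))
        # Advance the interval that ends first.
        if ea <= eb:
            i += 1
        else:
            j += 1
    return out


def tf_annotation(sites: List[Interval],
                  marks: Dict[str, List[Interval]],
                  flank: int = 50) -> List[Interval]:
    """Windows around predicted sites, restricted to the union of all marks."""
    mark_union = merge([iv for ivs in marks.values() for iv in ivs])
    return intersect(windows(sites, flank), mark_union)


# Toy data (hypothetical coordinates and mark names)
motif_sites = [("chr1", 1000, 1010), ("chr1", 5000, 5012)]
chromatin_marks = {
    "H3K27ac": [("chr1", 900, 1020)],
    "H3K4me1": [("chr1", 1015, 1100)],
    "DNase": [("chr2", 5000, 5100)],
}
annotation = tf_annotation(motif_sites, chromatin_marks)
print(annotation)
```

In this toy example only the first predicted site falls inside the chromatin-mark union, so the second prediction is dropped; this mirrors the idea that an identical sequence may be bound or unbound depending on the surrounding chromatin context.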

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites / genetics
  • Chromatin / genetics*
  • Computational Biology
  • Gene Expression Regulation / genetics
  • Genetic Diseases, Inborn / classification
  • Genetic Diseases, Inborn / genetics*
  • Genetic Diseases, Inborn / pathology
  • Humans
  • Linkage Disequilibrium / genetics
  • Molecular Sequence Annotation*
  • Multifactorial Inheritance / genetics
  • Polymorphism, Single Nucleotide / genetics
  • Protein Binding / genetics
  • Transcription Factors / genetics*

Substances

  • Chromatin
  • Transcription Factors