TAMC: A deep-learning approach to predict motif-centric transcriptional factor binding activity based on ATAC-seq profile

PLoS Comput Biol. 2022 Sep 12;18(9):e1009921. doi: 10.1371/journal.pcbi.1009921. eCollection 2022 Sep.

Abstract

Determining transcriptional factor binding sites (TFBSs) is critical for understanding the molecular mechanisms regulating gene expression in different biological conditions. Biological assays designed to directly map TFBSs require large sample sizes and intensive resources. As an alternative, the ATAC-seq assay is simple to conduct and provides genomic cleavage profiles that contain rich information for imputing TFBSs indirectly. Previous footprint-based tools are inherently limited by the accuracy of their bias correction algorithms and the efficiency of their feature extraction models. Here we introduce TAMC (Transcriptional factor binding prediction from ATAC-seq profile at Motif-predicted binding sites using Convolutional neural networks), a deep-learning approach for predicting motif-centric TF binding activity from paired-end ATAC-seq data. TAMC does not require bias correction during signal processing. By leveraging a one-dimensional convolutional neural network (1D-CNN) model, TAMC makes predictions based on both footprint and non-footprint features at binding sites for each TF and outperforms existing footprinting tools in TFBS prediction, particularly for ATAC-seq data with limited sequencing depth.
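To make the idea of scoring a motif-centered cleavage profile with a 1D convolution concrete, the following is a minimal sketch in plain Python. It is not TAMC's architecture or code: the function names, the single hand-picked kernel, the output weights, and the toy profiles are all hypothetical and chosen only for illustration, whereas a real model would learn many filters from labeled binding sites.

```python
import math
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float],
                 bias: float = 0.0) -> List[float]:
    """Slide one kernel along the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def relu(values: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x: float) -> float:
    """Logistic function mapping a score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_binding(profile: Sequence[float], kernel: Sequence[float],
                    out_weight: float, out_bias: float) -> float:
    """Score one motif-centered cleavage profile.

    Steps: depth-normalize the counts, apply one convolution filter,
    ReLU, global max pooling, then a logistic output unit.
    All weights are assumed values, not trained parameters.
    """
    total = sum(profile)
    normalized = [c / total for c in profile] if total > 0 else [0.0] * len(profile)
    features = relu(conv1d_valid(normalized, kernel))
    pooled = max(features)
    return sigmoid(out_weight * pooled + out_bias)


# Hypothetical filter that responds to a dip flanked by high cleavage.
DIP_KERNEL = [1.0, 1.0, -4.0, 1.0, 1.0]

# Toy cleavage counts around a motif site (invented for illustration).
footprint_profile = [8, 9, 8, 1, 0, 1, 8, 9, 8]   # protected center
flat_profile = [5, 5, 5, 5, 5, 5, 5, 5, 5]        # no protection

p_footprint = predict_binding(footprint_profile, DIP_KERNEL, 10.0, -1.0)
p_flat = predict_binding(flat_profile, DIP_KERNEL, 10.0, -1.0)
print(f"footprint-like site: {p_footprint:.3f}")
print(f"flat site:           {p_flat:.3f}")
```

With these assumed weights the footprint-like profile scores higher than the flat one. Dividing by the total count is only a crude stand-in for handling differences in sequencing depth and says nothing about how TAMC itself processes its input signal.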

MeSH terms

  • Binding Sites / genetics
  • Chromatin Immunoprecipitation Sequencing*
  • Deep Learning*
  • Protein Binding / genetics
  • Transcription Factors / metabolism

Substances

  • Transcription Factors