Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction

DNA Res. 2008 Feb 29;15(1):3-11. doi: 10.1093/dnares/dsm034. Epub 2008 Feb 7.

Abstract

Transcriptional regulation is the first level of regulation of gene expression and is therefore a major topic in computational biology. Genes with similar expression patterns can be assumed to be co-regulated at the transcriptional level by promoter sequences with a similar structure. Current approaches for modeling shared regulatory features tend to focus mainly on clustering of cis-regulatory sites. Here we introduce a Markov chain-based promoter structure model that uses both shared motifs and shared features from an input set of promoter sequences to predict candidate genes with similar expression. The model uses the positional preference, order, and orientation of motifs. The trained model is used to score a genomic set of promoter sequences: high-scoring promoters are assumed to have a structure similar to the input sequences and are thus expected to drive similar expression patterns. We applied our model to two datasets in Caenorhabditis elegans and in Ciona intestinalis. Both computational and experimental verifications indicate that this model is capable of predicting candidate promoters that drive expression patterns similar to those of the input regulatory sequences. This model can be useful for finding promising candidate genes for wet-lab experiments and for increasing our understanding of transcriptional regulation.
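
The train-then-score idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no details of the state space, the parameter estimation, or the scoring function, so everything below is an assumption made for illustration. Each motif hit is assumed to be a tuple (motif name, strand, position in bp), a state is assumed to be the combination of motif, orientation, and a coarse position bin, transitions are assumed to be first-order with add-one style pseudocounts, and the score is assumed to be a log-likelihood ratio against a uniform background. The names `to_states`, `train`, `score`, `bin_size`, and `pseudocount` are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical start state that precedes the first motif in every promoter.
START = ("START", "+", 0)


def to_states(promoter, bin_size=100):
    """Turn motif hits (motif, strand, position) into an ordered state list.

    Sorting by position captures motif order; the strand captures
    orientation; the position bin captures coarse positional preference.
    """
    hits = sorted(promoter, key=lambda hit: hit[2])
    return [(motif, strand, pos // bin_size) for motif, strand, pos in hits]


def train(promoters, bin_size=100, pseudocount=1.0):
    """Count first-order transitions between states in the input promoters."""
    counts = defaultdict(lambda: defaultdict(float))
    states = set()
    for promoter in promoters:
        prev = START
        for state in to_states(promoter, bin_size):
            counts[prev][state] += 1.0
            states.add(state)
            prev = state
    return {
        "counts": counts,
        "states": states,
        "bin_size": bin_size,
        "pseudocount": pseudocount,
    }


def score(model, promoter):
    """Log-likelihood ratio of a promoter's state path vs. a uniform background.

    Assumption: the background assigns probability 1/n to every transition,
    where n is the number of trained states plus one slot for unseen states.
    """
    n = len(model["states"]) + 1
    pc = model["pseudocount"]
    total = 0.0
    prev = START
    for state in to_states(promoter, model["bin_size"]):
        row = model["counts"].get(prev, {})
        row_total = sum(row.values())
        p = (row.get(state, 0.0) + pc) / (row_total + pc * n)
        total += math.log(p * n)  # log(p / (1 / n))
        prev = state
    return total


if __name__ == "__main__":
    # Toy input set: motif A on the + strand followed by motif B on the - strand.
    training = [
        [("A", "+", 50), ("B", "-", 150)],
        [("A", "+", 60), ("B", "-", 170)],
        [("A", "+", 40), ("B", "-", 120)],
    ]
    model = train(training)
    candidates = {
        "similar": [("A", "+", 55), ("B", "-", 160)],
        "rearranged": [("B", "+", 55), ("A", "-", 160)],
    }
    # Rank candidate promoters from highest to lowest score.
    for name in sorted(candidates, key=lambda k: score(model, candidates[k]), reverse=True):
        print(name, round(score(model, candidates[name]), 3))
```

In this toy setup, a candidate that shares the motif order, orientation, and position bins of the input set scores higher than one with the same motifs in a different arrangement, which mirrors the ranking step the abstract describes. A realistic version would need real motif scanning, a background model estimated from genomic promoters, and some handling of promoters with different numbers of motif hits.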

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Caenorhabditis elegans / metabolism
  • Ciona intestinalis / metabolism
  • DNA / chemistry*
  • Gene Expression Regulation*
  • Markov Chains
  • Models, Molecular*
  • Muscle Proteins / genetics
  • Nucleic Acid Conformation
  • Promoter Regions, Genetic*
  • Regulatory Elements, Transcriptional*

Substances

  • Muscle Proteins
  • DNA