IEM: an algorithm for iterative enhancement of motifs using comparative genomics data

Comput Syst Bioinformatics Conf. 2007:6:227-35.

Abstract

Understanding gene regulation is a key step in investigating gene functions and their relationships. Many algorithms have been developed to discover transcription factor binding sites (TFBS), which are predominantly located in the upstream regions of genes and contribute to transcriptional regulation when bound by a specific transcription factor. However, traditional motif-finding methods have shortcomings, which can be overcome by using comparative genomics data that is now increasingly available. Traditional methods for scoring motifs also have limitations. In this paper, we propose a new algorithm called IEM to refine motifs using comparative genomics data. We show the effectiveness of our techniques on several data sets: two sets of experiments were performed with comparative genomics data on five strains of P. aeruginosa, and one set of experiments was performed with similar data on four species of yeast. The weighted conservation score proposed in this paper is an improvement over existing motif scores.

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Base Sequence
  • Binding Sites
  • Chromosome Mapping / methods*
  • Molecular Sequence Data
  • Protein Binding
  • Sequence Alignment / methods
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid
  • Software*
  • Transcription Factors / genetics*

Substances

  • Transcription Factors