Analysis of eukaryotic promoter sequences reveals a systematically occurring CT-signal

Nucleic Acids Res. 1995 Apr 11;23(7):1223-30. doi: 10.1093/nar/23.7.1223.

Abstract

A general data study of eukaryotic promoter sequences from widely different species is presented. Mammalian promoters with known transcription initiation sites represented the largest subclass of the data, and for this group neural network algorithms were trained to predict the location of the initiation site in a test set. The prediction accuracy of this local method was higher than what could be expected from the known non-local structure of eukaryotic promoters. Subsequent analysis revealed, in addition to the consensus sequences of the two known important subregions (the TATA-box, TATAAA, and the Cap-signal, CA), a CT-signal positioned on average seven nucleotides downstream of the transcription initiation site. The consensus of the CT-signal is CTNCNG. The details of this core promoter element were disclosed using multiple alignment; the element had earlier been described only in a few isolated examples.
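To make the CTNCNG consensus concrete, the following minimal Python sketch scans a sequence for the motif downstream of a known initiation site. It is an illustration only, not the neural network or multiple-alignment method used in the study. The function name find_ct_signal, the 0-based tss_index argument, the max_offset window, and the choice to measure the offset to the first base of the match are all assumptions made for this example; N is read as the IUPAC code for any nucleotide.

```python
import re

# IUPAC "N" matches any nucleotide, so CTNCNG becomes CT[ACGT]C[ACGT]G.
# The lookahead lets overlapping occurrences be reported as well.
CT_SIGNAL = re.compile(r"(?=(CT[ACGT]C[ACGT]G))")


def find_ct_signal(seq, tss_index, max_offset=15):
    """Return (offset, motif) pairs for CTNCNG matches downstream of the TSS.

    tss_index is the 0-based position of the transcription initiation site
    (hypothetical input; the caller must already know it). offset is the
    distance from that site to the first base of the match, an assumed
    convention chosen for this sketch.
    """
    seq = seq.upper()
    hits = []
    for m in CT_SIGNAL.finditer(seq):
        offset = m.start(1) - tss_index
        if 0 < offset <= max_offset:
            hits.append((offset, m.group(1)))
    return hits


# Toy sequence (invented for illustration): TATA-box, spacer, Cap-signal CA,
# then a CTNCNG instance starting seven bases after the A of the Cap-signal.
toy = "TATAAA" + "G" * 24 + "CA" + "GGGGGG" + "CTTCAG" + "GG"
toy_hits = find_ct_signal(toy, tss_index=31)
```

On the toy sequence, toy_hits holds a single match, CTTCAG, at offset 7. A plain consensus scan like this will also report chance occurrences in real genomic sequence, so it should be read as a way to inspect the pattern rather than as a predictor.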

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence
  • Consensus Sequence
  • DNA / genetics
  • Eukaryotic Cells
  • Molecular Sequence Data
  • Neural Networks, Computer
  • Oligodeoxyribonucleotides / genetics*
  • Promoter Regions, Genetic*
  • Sequence Homology, Nucleic Acid
  • Signal Transduction / genetics
  • Species Specificity
  • TATA Box
  • Transcription, Genetic

Substances

  • Oligodeoxyribonucleotides
  • DNA