Significant dispersed recurrent DNA sequences in the Escherichia coli genome. Several new groups

J Mol Biol. 1993 Feb 20;229(4):833-48. doi: 10.1006/jmbi.1993.1090.

Abstract

New computer and statistical methods were used to determine significant direct and inverted repeats in the Escherichia coli contig sequence collection of aggregate 1.6 × 10^6 base-pairs. Eight groups of mostly new structural repeat identities were uncovered. Apart from the high statistical significance of these repeat sequences, there are suggestive relationships of the group matches in terms of neighboring genes, of genomic distributions, of their texts, and of their potentials for secondary structure. Four of these groups are relatively numerous (11 to 26 members): one is in coding sequences and three are in non-coding sequences. The coding group consists of the ATP-activated transmembrane component of a typical high-affinity protein-binding transport system. One of the non-coding groups consists of a special rho-independent transcription termination signal closely following an operon. The gene neighbors of this group often appear to be involved in some way in processing RNA or DNA. A second non-coding group has, for one or both neighboring genes, a component of a system responding to stress or starvation for some nutrient.
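
The abstract does not describe the repeat-finding procedure itself. As a rough illustration of what "direct" and "inverted" repeats mean computationally, the following is a minimal sketch, not the authors' method: it indexes exact words of a fixed length k and reports words that recur on the same strand (direct) or whose reverse complement also occurs (inverted). The function names, the exact-match criterion, and the choice of k are assumptions made for this example, and no statistical significance test is included.

```python
from collections import defaultdict

# Translation table for complementing DNA bases (assumes an A/C/G/T alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_repeats(seq, k):
    """Find exact length-k direct and inverted repeats in seq.

    Hypothetical helper for illustration only. Returns (direct, inverted):
      direct   maps a word to its start positions when it occurs more than once;
      inverted maps (word, reverse_complement) to the positions of each.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    direct = {w: p for w, p in positions.items() if len(p) > 1}

    inverted = {}
    for w, p in positions.items():
        rc = reverse_complement(w)
        # "w < rc" reports each pair once and skips words that are
        # their own reverse complement.
        if rc in positions and w < rc:
            inverted[(w, rc)] = (p, positions[rc])
    return direct, inverted


if __name__ == "__main__":
    # Toy sequence: GATTACA occurs twice, and its reverse complement
    # TGTAATC occurs once near the end.
    direct, inverted = find_repeats("GATTACATTGATTACAGGTGTAATC", 7)
    print(direct)
    print(inverted)
```

A real analysis at genome scale would also need to tolerate mismatches and to judge each match against what is expected by chance for the sequence composition, which is the role the abstract assigns to its statistical methods.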

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Base Sequence
  • Chromosomes, Bacterial
  • DNA, Bacterial
  • Escherichia coli / genetics*
  • Exons
  • Genome, Bacterial*
  • Introns
  • Molecular Sequence Data
  • Repetitive Sequences, Nucleic Acid*

Substances

  • DNA, Bacterial