Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment

Science. 1993 Oct 8;262(5131):208-14. doi: 10.1126/science.8211139.

Abstract

A wealth of protein and DNA sequence data is being generated by genome projects and other sequencing efforts. A crucial barrier to deciphering these sequences and understanding the relations among them is the difficulty of detecting subtle local residue patterns common to multiple sequences. Such patterns frequently reflect similar molecular structures and biological properties. A mathematical definition of this "local multiple alignment" problem suitable for full computer automation has been used to develop a new and sensitive algorithm, based on the statistical method of iterative sampling. This algorithm finds an optimized local alignment model for N sequences in N-linear time, requiring only seconds on current workstations, and allows the simultaneous detection and optimization of multiple patterns and pattern repeats. The method is illustrated as applied to helix-turn-helix proteins, lipocalins, and prenyltransferases.
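
The iterative sampling idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest site-sampler setting (exactly one pattern occurrence of a fixed width per sequence) and leaves out the multiple-pattern and repeat handling the abstract mentions. The function name `gibbs_site_sampler` and the parameters `width`, `iters`, `pseudo`, and `seed` are hypothetical names chosen for illustration. Each step removes one sequence from the current alignment model, scores every candidate window in that sequence against the model built from the others, and samples a new position in proportion to those scores.

```python
import random
from collections import Counter

def gibbs_site_sampler(seqs, width, iters=200, pseudo=1.0, seed=0):
    """Toy Gibbs site sampler (illustrative assumption: one site of
    fixed width per sequence). Returns one start index per sequence."""
    rng = random.Random(seed)
    alphabet = sorted(set("".join(seqs)))

    # Background residue frequencies over all sequences, with pseudocounts.
    bg_counts = Counter("".join(seqs))
    bg_total = sum(bg_counts.values()) + pseudo * len(alphabet)
    bg = {a: (bg_counts[a] + pseudo) / bg_total for a in alphabet}

    # Random initial site in each sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    # Per-column residue counts for the current set of sites.
    counts = [Counter() for _ in range(width)]
    for s, a in zip(seqs, starts):
        for j in range(width):
            counts[j][s[a + j]] += 1

    # Normalizer for a profile built from all sequences but one.
    denom = (len(seqs) - 1) + pseudo * len(alphabet)

    for _ in range(iters):
        for z, seq in enumerate(seqs):
            # Take sequence z out of the profile.
            for j in range(width):
                counts[j][seq[starts[z] + j]] -= 1

            # Weight each candidate start by the ratio of pattern
            # probability to background probability.
            weights = []
            for a in range(len(seq) - width + 1):
                w = 1.0
                for j in range(width):
                    r = seq[a + j]
                    w *= ((counts[j][r] + pseudo) / denom) / bg[r]
                weights.append(w)

            # Sample a new site for z and put it back into the profile.
            new = rng.choices(range(len(weights)), weights=weights)[0]
            starts[z] = new
            for j in range(width):
                counts[j][seq[new + j]] += 1

    return starts
```

A call such as `gibbs_site_sampler(["TTGACAGG", "CATTGACA", "GGTTGACATT"], width=5)` returns one start index per sequence. Because the procedure is stochastic, different seeds can give different alignments, and a practical tool would typically run several seeds and keep the best-scoring result. Updating the column counts incrementally, rather than rebuilding the profile for every sequence, keeps the cost of one pass proportional to the total sequence length in this sketch.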

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Carrier Proteins / chemistry*
  • Helix-Loop-Helix Motifs*
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Prenylation
  • Protein Structure, Secondary
  • Sequence Alignment / methods*
  • Software
  • Transferases / chemistry*

Substances

  • Carrier Proteins
  • Transferases

Associated data

  • GENBANK/L10416
  • PIR/S22843
  • SWISSPROT/P18898
  • SWISSPROT/P22007
  • SWISSPROT/Q02293