Rapid motif-based prediction of circular permutations in multi-domain proteins

Bioinformatics. 2005 Apr 1;21(7):932-7. doi: 10.1093/bioinformatics/bti085.

Abstract

Motivation: Rearrangements of protein domains and motifs, such as swaps and circular permutations (CPs), can produce erroneous results when sequence databases are searched with traditional methods based on linear sequence alignment. Circular permutations are also biologically relevant because they can help to better understand both protein evolution and function.

Results: We have developed an algorithm, RASPODOM, which is based on the classical recursive alignment scheme. Sequences are represented as strings of domains taken from precompiled domain (motif) databases such as ProDom. The algorithm runs several orders of magnitude faster than a reimplementation of the existing CP detection algorithm that works on strings of amino acids, produces virtually no false positives, and allows true CPs to be discriminated from 'intermediate' CPs (iCPs). Several true CPs that have not been reported in the literature so far could be identified from Swiss-Prot/TrEMBL within minutes.
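
As a rough illustration of the domain-string representation, the sketch below treats each protein as a list of domain labels and checks whether one list is a rotation of the other. It is a deliberately simplified exact-match check, not the recursive alignment scheme that RASPODOM is based on, and it does not distinguish true CPs from iCPs. The function name `cp_offset` and the labels `D1` to `D3` are hypothetical, chosen only for this example.

```python
from typing import List, Optional


def cp_offset(a: List[str], b: List[str]) -> Optional[int]:
    """Return k such that b == a[k:] + a[:k], or None if no such k exists.

    Assumption: domains are compared by exact label equality. Identical
    domain orders are not reported as circular permutations.
    """
    if len(a) != len(b) or a == b:
        return None
    # Try every non-trivial rotation of the domain string a.
    for k in range(1, len(a)):
        if a[k:] + a[:k] == b:
            return k
    return None


# Hypothetical domain architectures of two proteins.
protein_a = ["D1", "D2", "D3"]
protein_b = ["D3", "D1", "D2"]
print(cp_offset(protein_a, protein_b))  # → 2
```

Working on a handful of domain labels per protein rather than hundreds of residues is what keeps a comparison of this kind cheap; a real detector would additionally have to tolerate inserted, missing, or diverged domains, which is where an alignment-based approach comes in.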

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Amino Acid Motifs*
  • Computer Simulation
  • Databases, Protein*
  • Genetic Variation
  • Models, Chemical
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Structure, Tertiary
  • Proteins / analysis*
  • Proteins / chemistry*
  • Proteins / genetics
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins