CLEMAPS: multiple alignment of protein structures based on conformational letters

Proteins. 2008 May 1;71(2):728-36. doi: 10.1002/prot.21739.

Abstract

CLEMAPS is a tool for multiple alignment of protein structures. It distinguishes itself from other existing algorithms for multiple structure alignment by the use of conformational letters, which are discretized states of 3D segmental structural states. A letter corresponds to a cluster of combinations of three angles formed by C(alpha) pseudobonds of four contiguous residues. A substitution matrix called CLESUM is available to measure the similarity between any two such letters. The input 3D structures are first converted to sequences of conformational letters. Each string of a fixed length is then taken as the center seed to search other sequences for neighbors of the seed, which are strings similar to the seed. A seed and its neighbors form a center-star, which corresponds to a fragment set of local structural similarity shared by many proteins. The detection of center-stars using CLESUM is extremely efficient. Local similarity is a necessary, but insufficient, condition for structural alignment. Once center-stars are found, the spatial consistency between any two stars are examined to find consistent star duads using atomic coordinates. Consistent duads are later joined to create a core for multiple alignment, which is further polished to produce the final alignment. The utility of CLEMAPS is tested on various protein structure ensembles.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computational Biology / methods
  • Protein Conformation*
  • Protein Structure, Secondary
  • Proteins / chemistry
  • Sequence Analysis, Protein / methods*

Substances

  • Proteins