An amino acid substitution matrix for protein conformation identification

J Bioinform Comput Biol. 2006 Jun;4(3):769-82. doi: 10.1142/s0219720006002156.

Abstract

Amino acid substitution matrices play an essential role in protein sequence alignment, a fundamental task in bioinformatics. The most widely used matrices, such as PAM matrices derived from homologous sequences and BLOSUM matrices derived from aligned segments of PROSITE, did not integrate conformation information in their construction. The few existing structure-based matrices are derived from limited structure alignment data. Using the PDB_SELECT and DSSP databases, we create a database of sequence-conformation blocks that explicitly represent the sequence-structure relationship. Members of a block are identical in conformation and highly similar in sequence. From this block database, we derive a conformation-specific amino acid substitution matrix, CBSM60. The matrix shows improved performance in conformational segment search and homolog detection.
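
The abstract does not spell out how CBSM60 is computed from the block database. As a rough illustration of how a substitution matrix can be derived from blocks of aligned segments, the sketch below uses the standard BLOSUM-style log-odds calculation (observed pair frequencies divided by the frequencies expected from residue composition, scaled to half-bit units). It is an assumption that the authors' procedure resembles this; the function name `log_odds_matrix`, the toy block, and the omission of sequence clustering and weighting are all hypothetical simplifications.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_matrix(blocks):
    """Return BLOSUM-style half-bit log-odds scores from aligned blocks.

    blocks: list of blocks; each block is a list of equal-length,
    gap-free aligned segments (strings). Hypothetical helper, not the
    published CBSM60 procedure.
    """
    # Count every unordered residue pair seen in each alignment column.
    pair_counts = Counter()
    for block in blocks:
        for column in zip(*block):
            for a, b in combinations(column, 2):
                pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    if total == 0:
        raise ValueError("blocks contain no residue pairs")

    # Observed pair frequencies q_ab.
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Residue frequencies implied by the pairs: p_a = q_aa + sum_{b != a} q_ab / 2.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    # Log-odds score in half-bits: 2 * log2(observed / expected).
    scores = {}
    for (a, b), freq in q.items():
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(freq / expected))
    return scores


# Toy block of three aligned segments (made-up data, for illustration only).
toy_blocks = [["AC", "AC", "AD"]]
toy_scores = log_odds_matrix(toy_blocks)
```

Keys of the returned dictionary are alphabetically sorted residue pairs, so the score for a substitution is looked up as `toy_scores[("C", "D")]`. Pairs never observed in any block are simply absent here; a practical matrix would need pseudocounts or a floor score for them.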

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Amino Acid Substitution*
  • Computational Biology / methods*
  • Databases, Protein
  • Molecular Sequence Data
  • Protein Conformation*
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Sequence Alignment
  • Sequence Analysis, Protein

Substances

  • Proteins