PROSPECT II: protein structure prediction program for genome-scale applications

Protein Eng. 2003 Sep;16(9):641-50. doi: 10.1093/protein/gzg081.

Abstract

A new method for fold recognition is developed and added to the general protein structure prediction package PROSPECT (http://compbio.ornl.gov/PROSPECT/). The new method (PROSPECT II) has four key features. (i) We have developed an efficient way to utilize the evolutionary information for evaluating the threading potentials including singleton and pairwise energies. (ii) We have developed a two-stage threading strategy: (a) threading using dynamic programming without considering the pairwise energy and (b) fold recognition considering all the energy terms, including the pairwise energy calculated from the dynamic programming threading alignments. (iii) We have developed a combined z-score scheme for fold recognition, which takes into consideration the z-scores of each energy term. (iv) Based on the z-scores, we have developed a confidence index, which measures the reliability of a prediction and a possible structure-function relationship based on a statistical analysis of a large data set consisting of threadings of 600 query proteins against the entire FSSP templates. Tests on several benchmark sets indicate that the evolutionary information and other new features of PROSPECT II greatly improve the alignment accuracy. We also demonstrate that the performance of PROSPECT II on fold recognition is significantly better than any other method available at all levels of similarity. Improvement in the sensitivity of the fold recognition, especially at the superfamily and fold levels, makes PROSPECT II a reliable and fully automated protein structure and function prediction program for genome-scale applications.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Artificial Intelligence
  • Computational Biology
  • Databases, Nucleic Acid
  • Databases, Protein
  • Models, Molecular*
  • Protein Structure, Tertiary / genetics
  • Sequence Analysis, Protein*
  • Software*