Enhanced recognition of protein transmembrane domains with prediction-based structural profiles

Bioinformatics. 2006 Feb 1;22(3):303-9. doi: 10.1093/bioinformatics/bti784. Epub 2005 Nov 17.

Abstract

Motivation: Membrane domain prediction has recently been re-evaluated by several groups, suggesting that the accuracy of existing methods is still rather limited. In this work, we revisit this problem and propose novel methods for prediction of alpha-helical as well as beta-sheet transmembrane (TM) domains. The new approach is based on a compact representation of an amino acid residue and its environment, which consists of predicted solvent accessibility and secondary structure of each amino acid. A recently introduced method for solvent accessibility prediction trained on a set of soluble proteins is used here to indicate segments of residues that are predicted not to be accessible to water and, therefore, may be 'buried' in the membrane. While evolutionary profiles in the form of a multiple alignment are used to derive these simple 'structural profiles', they are not used explicitly for the membrane domain prediction and the overall number of parameters in the model is significantly reduced. This offers the possibility of a more reliable estimation of the free parameters in the model with a limited number of experimentally resolved membrane protein structures.
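As an illustration of the compact representation described above, the following minimal sketch flags runs of residues with low predicted relative solvent accessibility (RSA) and builds a small per-residue window of structural features. It is not the authors' implementation: the function names, the RSA cutoff of 0.25, the minimum run length of 15 residues, the window size and the zero padding are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
from typing import List, Sequence, Tuple


def buried_segments(rsa: Sequence[float],
                    threshold: float = 0.25,   # assumed cutoff, not from the paper
                    min_len: int = 15          # assumed minimum run length
                    ) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for runs of residues whose
    predicted RSA is below `threshold` and that span at least `min_len` residues."""
    segments = []
    start = None
    for i, value in enumerate(rsa):
        if value < threshold:
            if start is None:
                start = i          # a new low-accessibility run begins here
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(rsa) - start >= min_len:
        segments.append((start, len(rsa)))
    return segments


def window_features(rsa: Sequence[float],
                    ss_probs: Sequence[Tuple[float, float, float]],
                    center: int,
                    half_width: int = 4        # assumed window half-width
                    ) -> List[float]:
    """Concatenate [RSA, P(helix), P(strand), P(coil)] for each position in a
    window around `center`; positions outside the sequence are zero-padded."""
    features: List[float] = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(rsa):
            features.append(rsa[i])
            features.extend(ss_probs[i])
        else:
            features.extend([0.0, 0.0, 0.0, 0.0])
    return features


# Toy input: 5 exposed residues, 20 predicted-buried residues, 5 exposed residues.
toy_rsa = [0.8] * 5 + [0.1] * 20 + [0.7] * 5
toy_ss = [(0.1, 0.1, 0.8)] * 30
print(buried_segments(toy_rsa))                    # [(5, 25)]
print(len(window_features(toy_rsa, toy_ss, 10)))   # 36 = 9 positions x 4 values
```

With four values per residue, such a window has far fewer inputs than one built from a 20-column evolutionary profile, which is the reduction in free parameters that the abstract refers to.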

Results: Using cross-validated training on available sets of structurally resolved and non-redundant alpha and beta membrane proteins, we demonstrate that membrane domain prediction methods based on such a compact representation outperform approaches that explicitly use evolutionary profiles and multiple alignments. Moreover, using an external evaluation by the TMH Benchmark server, we show that our final prediction protocol for TM helix prediction is competitive with state-of-the-art methods, achieving per-residue accuracy of approximately 89% and per-segment accuracy of approximately 80% on the set of high-resolution structures used by the TMH Benchmark server. At the same time, the observed rates of confusion with signal peptides and globular proteins are the lowest among the tested methods. The new method is available online at http://minnou.cchmc.org.
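To make the two kinds of accuracy figures concrete, the sketch below computes a per-residue score and a simple segment-level score from two-state label strings. This is a simplified illustration only: the label alphabet ("M" for membrane, "-" for non-membrane), the function names and the minimum overlap of 3 residues are assumptions, and the TMH Benchmark server's exact definitions may differ.

```python
from typing import List, Tuple


def per_residue_accuracy(observed: str, predicted: str) -> float:
    """Fraction of positions where the predicted two-state label matches the observed one."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("label strings must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def find_segments(labels: str, state: str = "M") -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of `state` in `labels`."""
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == state and start is None:
            start = i
        elif label != state and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


def segment_recall(observed: str, predicted: str, min_overlap: int = 3) -> float:
    """Fraction of observed segments overlapped by some predicted segment
    by at least `min_overlap` residues (an assumed criterion)."""
    obs = find_segments(observed)
    pred = find_segments(predicted)
    if not obs:
        raise ValueError("no observed membrane segments")
    hits = 0
    for o_start, o_end in obs:
        if any(min(o_end, p_end) - max(o_start, p_start) >= min_overlap
               for p_start, p_end in pred):
            hits += 1
    return hits / len(obs)


# Toy example: two observed helices, only the first one is predicted.
observed = "-" * 2 + "M" * 5 + "-" * 3 + "M" * 5 + "-" * 2
predicted = "-" * 3 + "M" * 4 + "-" * 10
print(per_residue_accuracy(observed, predicted))  # 11 of 17 positions agree
print(segment_recall(observed, predicted))        # 0.5
```

The toy case shows why both measures are reported: a prediction can agree with most residues while still missing an entire segment.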

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Artificial Intelligence
  • Cell Membrane / chemistry*
  • Membrane Proteins / analysis
  • Membrane Proteins / chemistry*
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods
  • Protein Conformation
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Software
  • Solvents / chemistry
  • Structure-Activity Relationship

Substances

  • Membrane Proteins
  • Solvents