Machine learning approaches for the prediction of signal peptides and other protein sorting signals

H Nielsen; S Brunak; G von Heijne

doi:10.1093/protein/12.1.3

Protein Eng. 1999 Jan;12(1):3-9. doi: 10.1093/protein/12.1.3.

Authors

H Nielsen¹, S Brunak, G von Heijne

Affiliation

¹ Center for Biological Sequence Analysis, Department of Biotechnology, The Technical University of Denmark, Lyngby.

PMID: 10065704

Abstract

Prediction of protein sorting signals from the amino acid sequence is of great importance in proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability where practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the best-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.
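
The start codon problem mentioned in the abstract can be illustrated with a minimal Python sketch. It is not SignalP and does not reproduce the authors' method. It only shows why a misannotated start codon matters: each in-frame methionine near the N-terminus gives a different candidate N-terminal region for a predictor to score. The function names, the window length of 8 and the region lengths are assumptions chosen for illustration, and the mean Kyte-Doolittle hydropathy used here is a crude stand-in for a trained neural network or hidden Markov model.

```python
# Kyte-Doolittle hydropathy scale (a standard published scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(seq, window=8):
    """Return the highest mean hydropathy over any window of `window` residues.

    Hypothetical toy score, not a signal peptide predictor. Residues that are
    not in the scale (e.g. X) are counted as 0.0.
    """
    if len(seq) < window:
        return float("-inf")
    return max(
        sum(KD.get(aa, 0.0) for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    )


def candidate_n_termini(protein, region=40, search=30):
    """Yield (start_index, n_terminal_region) for each Met in the first
    `search` residues, treating each one as a possible true start."""
    for i, aa in enumerate(protein[:search]):
        if aa == "M":
            yield i, protein[i:i + region]


if __name__ == "__main__":
    # Made-up sequence for illustration only; it is not from any database.
    protein = "MSTNDEKRMKLLLLLLLLAVAGSAQADEKRDEKR"
    for start, n_term in candidate_n_termini(protein):
        print(start, n_term, max_window_hydropathy(n_term))
```

In a real pipeline the toy score would be replaced by an actual signal peptide predictor, and the candidate starts would come from a start codon prediction on the nucleotide sequence rather than from a scan for methionines.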

Publication types

Comparative Study
Research Support, Non-U.S. Gov't
Review

MeSH terms

Algorithms
Codon, Initiator / analysis
Computer Simulation
Markov Chains
Methanococcus / chemistry
Models, Statistical
Neural Networks, Computer*
Pattern Recognition, Automated*
Protein Sorting Signals / analysis*

Substances

Codon, Initiator
Protein Sorting Signals