Assigning spectrum-specific P-values to protein identifications by mass spectrometry

Bioinformatics. 2011 Apr 15;27(8):1128-34. doi: 10.1093/bioinformatics/btr089. Epub 2011 Feb 23.

Abstract

Motivation: Although many methods and statistical approaches have been developed for protein identification by mass spectrometry, accurately assessing the statistical significance of protein identifications remains an open problem. The main issues are as follows: (i) the statistical significance of peptide inference from experimental mass spectra must be platform independent and spectrum specific and (ii) individual spectrum matches at the peptide level must be combined into a single statistical measure at the protein level.

Results: We present a method and software to assign statistical significance to protein identifications from search engines for mass spectrometric data. The approach is based on the asymptotic theory of order statistics. The parameters of the asymptotic distributions of identification scores are estimated for each spectrum individually. The method relies on new unbiased estimators for the parameters of the extreme value distribution. The estimated parameters are used to assign a spectrum-specific P-value to each peptide-spectrum match. The protein-level confidence measure combines the P-values of the peptide-spectrum matches.
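The pipeline described above (fit an extreme value distribution per spectrum, convert a match score to a P-value, combine P-values per protein) can be illustrated with a minimal sketch. It is not the authors' software: it assumes a Gumbel (type I extreme value) form, substitutes plain method-of-moments estimates for the paper's new unbiased estimators, and uses Fisher's method as a stand-in for the protein-level combination rule, which the abstract does not specify. All function names and the example scores are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
P_FLOOR = 1e-300                  # guards log(0) when a P-value underflows


def fit_gumbel_moments(scores):
    """Method-of-moments Gumbel fit; returns (location mu, scale beta).

    Assumption: a stand-in for the paper's unbiased estimators.
    """
    beta = statistics.stdev(scores) * math.sqrt(6) / math.pi
    mu = statistics.fmean(scores) - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Upper-tail probability P(X >= score) under Gumbel(mu, beta)."""
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small P-values
    return -math.expm1(-math.exp(-(score - mu) / beta))


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    Assumption: the abstract does not name the combination rule.
    Uses the closed-form chi-square survival function for 2k degrees
    of freedom, so no external statistics library is needed.
    """
    k = len(p_values)
    half = -sum(math.log(max(p, P_FLOOR)) for p in p_values)  # statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical scores of candidate peptides for one spectrum (null sample).
null_scores = [12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 8.9, 12.7]
mu, beta = fit_gumbel_moments(null_scores)
p_spectrum = gumbel_p_value(25.0, mu, beta)       # P-value of the top match
p_protein = fisher_combine([p_spectrum, 0.02])    # second P-value is made up
```

Because the fit uses only the scores observed for one spectrum, the resulting P-value is spectrum specific, and because it needs only a single score per match, the same sketch applies to any search engine's scoring scale.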

Conclusion: We extensively tested the method using triplicate mouse and yeast high-throughput proteomic experiments. The proposed statistical approach improves the sensitivity of protein identifications without compromising specificity. Although the method was primarily designed to work with Mascot, it is platform independent and is applicable to any search engine that outputs a single score for a peptide-spectrum match. We demonstrate this by testing the method in conjunction with X!Tandem.

Availability: The software is available for download at ftp://genetics.bwh.harvard.edu/SSPV/.

Contact: ssunyaev@rics.bwh.harvard.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Data Interpretation, Statistical
  • Databases, Protein
  • Mass Spectrometry / methods*
  • Mice
  • Peptides / chemistry
  • Proteins / analysis
  • Proteins / chemistry*
  • Proteomics
  • Saccharomyces cerevisiae Proteins / analysis
  • Saccharomyces cerevisiae Proteins / chemistry
  • Software

Substances

  • Peptides
  • Proteins
  • Saccharomyces cerevisiae Proteins