FlexGrePPS--flexible greedy peptide pool search: computation of near-optimal sets of degenerate polypeptides for antigenic screening

J Bioinform Comput Biol. 2012 Oct;10(5):1250009. doi: 10.1142/S0219720012500096. Epub 2012 Jun 22.

Abstract

Although synthesizing and using individual peptides and DNA primers has become relatively inexpensive, massively parallel probing and next-generation sequencing approaches have dramatically increased the number of molecules that can be screened; this, in turn, requires vast numbers of peptides and therefore incurs significant expense. To alleviate this issue, pools of related molecules are often used to downselect candidates before individual sequences are tested. A computational process for creating pools of related sequences at large scale has not been reported for peptides. For PCR primers, the problem has been addressed successfully by designing degenerate primers, which can be produced at the same cost as conventional, unique primers and can then be used to amplify several different genomic regions. We present an algorithm, "FlexGrePPS" (Flexible Greedy Peptide Pool Search), that creates a near-optimal set of peptide pools. The approach is also applicable to nucleotide sequences and outperforms most DNA primer selection programs. For proteomic compression with FlexGrePPS, the main body of the work presented here, we demonstrate the feasibility of computing an exhaustive cover of pathogenic proteomes with degenerate peptides that lend themselves to antigenic screening. Furthermore, we present preliminary data that demonstrate the experimental utility of highly degenerate peptides for antigenic screening. FlexGrePPS provides a near-optimal solution for proteomic compression; no other programs are available for comparison. We also demonstrate the computational performance of GreedyPrime, a modified version of FlexGrePPS for the design of degenerate primers, which is comparable to existing programs for that task. Specifically, we focus on comparisons with PAMPS and DPS-DIP, software tools that have recently been shown to be superior to other methods. FlexGrePPS forms the foundation of a novel antigenic screening methodology based on representing an entire proteome by near-optimal degenerate peptide pools. Our preliminary wet lab data indicate that the approach will likely prove successful in comprehensive wet lab studies and hence will dramatically reduce the expense of antigenic screening and make whole-proteome screening feasible. Although FlexGrePPS was designed for computational performance in order to handle vast data sets, we found, surprisingly, that even for small data sets its primer design version, GreedyPrime, offers similar or even superior results for MP-DPD and most MDPD instances when compared with existing methods; despite their much longer run times, the other approaches did not fare significantly better in reducing the original data sets to degenerate primers. The FlexGrePPS and GreedyPrime programs are available at no charge under the GNU LGPL license at http://sourceforge.net/projects/flexgrepps/.
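
To make the idea of greedily pooling sequences into degenerate peptides concrete, the following is a minimal illustrative sketch in Python. It is not the FlexGrePPS implementation; the abstract does not describe the algorithm's internals. The sketch assumes equal-length peptides, models a degenerate peptide as one set of allowed residues per position, and assumes a user-chosen upper limit on degeneracy (the number of distinct sequences a pool encodes). All function names and the `max_degeneracy` parameter are hypothetical.

```python
from typing import List, Set

Pool = List[Set[str]]  # one set of allowed residues per position (assumed model)


def degeneracy(pool: Pool) -> int:
    """Number of distinct sequences the degenerate peptide encodes."""
    total = 1
    for residues in pool:
        total *= len(residues)
    return total


def merge(pool: Pool, peptide: str) -> Pool:
    """Return a new pool widened just enough to also encode `peptide`."""
    return [residues | {aa} for residues, aa in zip(pool, peptide)]


def covers(pool: Pool, peptide: str) -> bool:
    """True if `peptide` is one of the sequences encoded by `pool`."""
    return all(aa in residues for residues, aa in zip(pool, peptide))


def greedy_pools(peptides: List[str], max_degeneracy: int) -> List[Pool]:
    """Greedily cover all peptides with pools of bounded degeneracy.

    Hypothetical heuristic: seed a pool with the first uncovered peptide,
    then keep absorbing the peptide that increases degeneracy the least
    until the assumed limit would be exceeded.
    """
    if len({len(p) for p in peptides}) > 1:
        raise ValueError("this sketch assumes equal-length peptides")
    remaining = sorted(set(peptides))
    pools: List[Pool] = []
    while remaining:
        pool: Pool = [{aa} for aa in remaining.pop(0)]
        while remaining:
            candidate = min(remaining, key=lambda p: degeneracy(merge(pool, p)))
            merged = merge(pool, candidate)
            if degeneracy(merged) > max_degeneracy:
                break
            pool = merged
            # Drop every peptide the widened pool now encodes.
            remaining = [p for p in remaining if not covers(pool, p)]
        pools.append(pool)
    return pools


def to_pattern(pool: Pool) -> str:
    """Render a pool as a bracketed pattern, e.g. A[CG]D[EF]."""
    return "".join(
        next(iter(r)) if len(r) == 1 else "[" + "".join(sorted(r)) + "]"
        for r in pool
    )


if __name__ == "__main__":
    example = ["ACDE", "ACDF", "AGDE", "WYKL"]
    for pool in greedy_pools(example, max_degeneracy=4):
        print(to_pattern(pool), degeneracy(pool))
    # prints:
    # A[CG]D[EF] 4
    # WYKL 1
```

In this toy run, three related peptides collapse into a single degenerate peptide, while the unrelated one remains on its own. The pool `A[CG]D[EF]` also encodes `AGDF`, which was not in the input; this is the usual trade-off of degenerate synthesis, and the degeneracy limit bounds how many such extra sequences a pool can contain.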

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Antigens / chemistry*
  • DNA Primers / chemistry
  • Genome
  • Peptides / chemistry*
  • Proteomics / methods*
  • Software*

Substances

  • Antigens
  • DNA Primers
  • Peptides