Stable structure-approximating inverse protein folding in 2D hydrophobic-polar-cysteine (HPC) model

J Comput Biol. 2009 Jan;16(1):19-30. doi: 10.1089/cmb.2008.0096.

Abstract

The inverse protein folding problem is that of designing an amino acid sequence that folds into a prescribed conformation (structure). This problem arises in drug design, where a particular structure is necessary to ensure proper protein-protein interactions. Gupta et al. (2005) introduced a design in the two-dimensional (2D) hydrophobic-polar (HP) model of Dill that can be used to approximate any given 2D shape. They conjectured that the protein sequences of their design are stable, but proved stability only for an infinite class of very basic structures. We introduce a refinement of the HP model in which cysteine and non-cysteine hydrophobic monomers are distinguished, and the SS-bridges that two cysteines can form are taken into account in the energy function. We call this model the HPC model. We consider a subclass of the linear structures designed in Gupta et al. (2005) which is rich enough to approximate (although more coarsely) any given structure. We refine these structures for the HPC model by replacing approximately half of the H amino acids with cysteines, and call the resulting structures snake structures. We first prove that the proteins of the snake structures are stable under the strong HPC model, in which we make the additional assumption that non-cysteine amino acids act as cysteines, i.e., they can form their own bridges to reduce the energy. We then consider a subclass of snake structures, called wave structures, that can still approximate any given shape, and prove that their proteins are stable under the proper HPC model. This partially confirms the conjecture stated in Gupta et al. (2005). To prove these results we developed a computational tool, called 2DHPSolver, which we used to perform the large case analysis required for the proofs. We conjecture that the proteins of snake structures are also stable under the proper HPC model.
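As an illustration of the kind of energy function the abstract describes, here is a minimal Python sketch that scores a 2D square-lattice conformation. It is not the authors' 2DHPSolver, and several details are assumptions, because the abstract does not give them: each contact between two hydrophobic monomers (H or C) that are lattice neighbors but not chain neighbors contributes -1, and each cysteine-cysteine contact contributes an additional -bridge_weight for the SS-bridge. As a simplification, every C-C contact is counted as a bridge, without limiting a cysteine to a single bridge. The names fold_to_coords, hpc_energy and bridge_weight, and the U/D/L/R move encoding, are hypothetical.

```python
from itertools import combinations

# Unit steps on the 2D square lattice (hypothetical move encoding).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_to_coords(moves):
    """Convert a move string into lattice coordinates, starting at (0, 0)."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    return coords


def hpc_energy(sequence, moves, bridge_weight=1):
    """Score a conformation under an assumed HPC-style energy function.

    sequence: string over 'H' (hydrophobic), 'P' (polar), 'C' (cysteine).
    moves: string over 'U', 'D', 'L', 'R', one character per chain bond.
    bridge_weight: assumed extra reward for a cysteine-cysteine contact.
    """
    coords = fold_to_coords(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")

    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # chain neighbors do not count as contacts
        (x1, y1), (x2, y2) = coords[i], coords[j]
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            continue  # not adjacent on the lattice
        a, b = sequence[i], sequence[j]
        if a in "HC" and b in "HC":
            energy -= 1  # hydrophobic contact (cysteine assumed hydrophobic)
        if a == "C" and b == "C":
            energy -= bridge_weight  # assumed SS-bridge bonus
    return energy
```

For example, hpc_energy("CPPC", "URD") scores a four-monomer chain folded into a unit square, where the two end cysteines touch. A design is stable in the sense used above when the target structure is its unique minimum-energy conformation; checking that requires comparing against all other self-avoiding conformations, which this sketch does not attempt.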

MeSH terms

  • Amino Acid Sequence
  • Computer Simulation*
  • Cysteine / chemistry*
  • Hydrophobic and Hydrophilic Interactions
  • Models, Molecular*
  • Protein Folding*
  • Protein Structure, Secondary*

Substances

  • Cysteine