Simultaneous alignment and folding of protein sequences

J Comput Biol. 2014 Jul;21(7):477-91. doi: 10.1089/cmb.2013.0163. Epub 2014 Apr 25.

Abstract

Accurate comparative analysis tools for low-homology proteins remain a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm's complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. When tested against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise and multiple sequence alignment tools in the most difficult low-sequence-homology case. It also improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families (partiFold-Align is available at http://partifold.csail.mit.edu/).

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Computational Biology*
  • Humans
  • Membrane Proteins / chemistry*
  • Models, Molecular
  • Protein Folding*
  • Protein Structure, Secondary
  • Sequence Alignment*
  • Software

Substances

  • Membrane Proteins