Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

BMC Bioinformatics. 2007 Nov 28;8:464. doi: 10.1186/1471-2105-8-464.

Abstract

Background: The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the base pairings (Watson-Crick, wobble and non-canonical pairings). This "comparative approach" consists of identifying such mutations in alignments of homologous sequences. The sequences must vary enough for compensatory mutations to be revealed, but they become difficult to align and compare if they are too divergent. The choice of homologous sequences is therefore critical. Although many combinations of homologous sequences can be used for prediction, only a few give good structure predictions; poor predictions can result from poor-quality alignment in stems or from the variability of certain sequences. This problem of sequence selection is currently unsolved.
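The idea of a compensatory mutation can be made concrete with a small sketch. It is not taken from the paper: the function name `compensatory_count`, the toy alignment, and the restriction to Watson-Crick and G-U wobble pairs (non-canonical pairs are ignored) are all assumptions made for illustration. For two alignment columns, it counts the sequence pairs in which both residues differ while a canonical pair is preserved.

```python
from itertools import combinations

# Assumption: only Watson-Crick and G-U wobble pairs are treated as pairings here.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def compensatory_count(alignment, i, j):
    """Count sequence pairs showing a compensatory change at columns i and j.

    A pair of sequences is counted when both form a canonical base pair at
    (i, j) and the residues differ at column i AND at column j.
    """
    count = 0
    for a, b in combinations(alignment, 2):
        pair_a, pair_b = (a[i], a[j]), (b[i], b[j])
        both_paired = pair_a in CANONICAL and pair_b in CANONICAL
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_paired and both_changed:
            count += 1
    return count


# Hypothetical toy alignment: columns 1 and 8 (0-based) covary (A-U, G-C, C-G),
# while columns 0 and 9 are a conserved G-C pair.
toy = [
    "GACUUCGGUC",
    "GGCUUCGGCC",
    "GCCUUCGGGC",
]

print(compensatory_count(toy, 1, 8))  # covarying columns
print(compensatory_count(toy, 0, 9))  # conserved columns give no signal
```

A perfectly conserved pair of columns scores zero, which is why sequences that are too similar reveal nothing. In the other direction, the count is only meaningful if columns `i` and `j` are correctly aligned across all sequences.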

Results: This paper describes an algorithm, SSCA, which measures the suitability of sequences for the comparative approach. It is based on evolutionary models with structure constraints, particularly those on sequence variations and stem alignment. We propose three models, based on different constraints on sequence alignments. We show the results of the SSCA algorithm for predicting the secondary structure of several RNAs. SSCA enabled us to choose sets of homologous sequences that gave better predictions than arbitrarily chosen sets of homologous sequences.
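The abstract does not specify SSCA's evolutionary models, so the sketch below is not SSCA. It only illustrates the general shape of the selection problem: enumerate candidate subsets of aligned homologous sequences and rank them by a score. The score used here is a stand-in assumption: the fraction of sequence pairs whose percent identity falls in an arbitrary window. The names `percent_identity` and `rank_subsets` and the thresholds `low` and `high` are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of aligned columns where two equal-length rows agree.

    Columns that are gaps ('-') in both rows are skipped.
    """
    cols = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not cols:
        return 0.0
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def rank_subsets(alignment, size, low=60.0, high=90.0):
    """Rank all subsets of `size` sequences, best first.

    Stand-in score (an assumption, not the SSCA measure): the fraction of
    sequence pairs whose identity lies in [low, high], i.e. divergent enough
    to covary but similar enough to align.
    """
    if size < 2:
        raise ValueError("size must be at least 2")
    ranked = []
    for names in combinations(sorted(alignment), size):
        pairs = list(combinations(names, 2))
        in_window = sum(
            low <= percent_identity(alignment[p], alignment[q]) <= high
            for p, q in pairs
        )
        ranked.append((in_window / len(pairs), names))
    # Highest score first; ties broken by name for a deterministic order.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return ranked


# Hypothetical aligned sequences: "c" duplicates "a", "d" is unrelated.
toy_alignment = {
    "a": "GACUUCGGUC",
    "b": "GGCUUCGGCC",
    "c": "GACUUCGGUC",
    "d": "CUAGGAUCAG",
}

for score, names in rank_subsets(toy_alignment, 2):
    print(f"{score:.2f}", names)
```

Exhaustive enumeration grows combinatorially with the number of candidate sequences, so a sketch like this is only practical for small pools.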

Conclusion: SSCA is an algorithm for selecting combinations of homologous RNA sequences suitable for secondary structure prediction by the comparative approach.

Publication types

  • Comparative Study

MeSH terms

  • Algorithms*
  • Base Pairing / genetics*
  • Base Sequence
  • Computational Biology / methods*
  • Escherichia coli / genetics
  • Evolution, Molecular
  • Genetic Variation
  • Models, Genetic*
  • RNA / genetics*
  • Sequence Alignment
  • Sequence Homology

Substances

  • RNA