De novo semi-alignment of 16S rRNA gene sequences for deep phylogenetic characterization of next generation sequencing data

Microbes Environ. 2013;28(2):211-6. doi: 10.1264/jsme2.me12157. Epub 2013 Apr 20.

Abstract

We addressed the challenges of analyzing next-generation 16S rRNA gene deep sequencing data from the uncharacterized microbial majority using a novel de novo semi-alignment approach. The semi-alignments were based on Orthologous Tri-Nucleotides (OTNs), which are identical trinucleotides located in the same sequence region. OTNs in high-error homopolymeric tracts were excluded to avoid overestimation of genetic distances. Phylogenetic information was derived assuming an exponential decay in shared OTNs between pairs of bacteria. OTN relatedness was also explored through principal component analysis (PCA). In evaluating the OTN approach, we reanalyzed a dataset consisting of triplicate GS FLX Titanium pyrosequencing runs for each of two experimental soil samples, in addition to analyses of the Greengenes core dataset. The conclusion from these comparisons was that the OTN approach was superior to traditional alignments with respect to both speed and accuracy. We therefore believe that our OTN-based semi-alignment approach will be a valuable contribution to future exploration of deep sequencing data.
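
The OTN idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how sequence regions are delimited, how homopolymeric tracts are detected, or what exact form the decay model takes. The sketch assumes fixed-width regions (the hypothetical `region_size` parameter), treats any trinucleotide made of one repeated base as homopolymeric, and reads "exponential decay in shared OTNs" as a distance equal to the negative log of the shared OTN fraction. The function names `otn_set`, `shared_fraction`, and `otn_distance` are invented for illustration.

```python
import math


def otn_set(seq, region_size=50, skip_homopolymers=True):
    """Return the set of (region index, trinucleotide) pairs in a sequence.

    Assumption: a "region" is a fixed-width window of region_size bases.
    Assumption: a homopolymeric trinucleotide is one repeated base (e.g. AAA).
    """
    seq = seq.upper()
    otns = set()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        # Skip trinucleotides containing ambiguous or non-nucleotide symbols.
        if any(base not in "ACGT" for base in tri):
            continue
        # Skip error-prone homopolymers so they do not inflate distances.
        if skip_homopolymers and tri[0] == tri[1] == tri[2]:
            continue
        otns.add((i // region_size, tri))
    return otns


def shared_fraction(otns_a, otns_b):
    """Fraction of OTNs shared by two sequences (shared / union)."""
    union = otns_a | otns_b
    if not union:
        return 1.0
    return len(otns_a & otns_b) / len(union)


def otn_distance(seq_a, seq_b, region_size=50):
    """Distance under an assumed exponential decay of the shared fraction.

    If the shared fraction f decays as exp(-d), then d = -ln(f).
    """
    f = shared_fraction(otn_set(seq_a, region_size),
                        otn_set(seq_b, region_size))
    if f == 0.0:
        return math.inf
    if f == 1.0:
        return 0.0
    return -math.log(f)
```

Because no base-by-base alignment is computed, each sequence is reduced to a set once and pairwise comparison becomes a set intersection, which is consistent with the speed advantage the abstract reports. The pairwise distances, or a per-sequence OTN presence/absence matrix, could then serve as input to clustering or PCA.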

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / classification*
  • Bacteria / genetics*
  • Cluster Analysis
  • Computational Biology / methods*
  • Genes, rRNA*
  • Genetic Variation*
  • High-Throughput Nucleotide Sequencing / methods*
  • Phylogeny
  • RNA, Ribosomal, 16S / genetics*
  • Sequence Alignment
  • Sequence Homology, Nucleic Acid
  • Time Factors

Substances

  • RNA, Ribosomal, 16S