Predicting relatedness of bacterial genomes using the chaperonin-60 universal target (cpn60 UT): application to Thermoanaerobacter species

Syst Appl Microbiol. 2011 May;34(3):171-9. doi: 10.1016/j.syapm.2010.11.019. Epub 2011 Mar 9.

Abstract

D.R. Zeigler determined that the sequence identity of bacterial genomes can be predicted accurately from the sequence identities of a corresponding set of genes that meet certain criteria [32]. This three-gene model for comparing bacterial genome pairs requires determining the sequence identities of recN, thdF, and rpoA. Doing so involves generating approximately 4.2 kb of genomic DNA sequence from each organism to be compared, and normally also requires designing oligonucleotide primers for amplification and sequencing based on the sequences of closely related organisms. We have developed an analogous mathematical model that predicts the sequence identity of whole genomes from the sequence identity of the 542-567 base pair chaperonin-60 universal target (cpn60 UT). The cpn60 UT is accessible in nearly all bacterial genomes with a single set of universal primers, and it is short enough to be sequenced completely in one pair of overlapping reads by dideoxy sequencing. Both models were applied to a set of Thermoanaerobacter isolates from a wood chip compost pile; the one-gene cpn60 UT-based model and the three-gene model based on recN, rpoA, and thdF both predicted that these isolates could be classified as Thermoanaerobacter thermohydrosulfuricus. Furthermore, the genomic prediction model using cpn60 UT gave results similar to whole-genome sequence alignments over a broad range of taxa, suggesting that this method may have general utility for screening isolates and predicting their taxonomic affiliations.
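
The general workflow described in the abstract (measure marker-gene identity for a genome pair, then map it to a predicted whole-genome identity) can be illustrated with a minimal Python sketch. This is not the authors' model: the abstract gives neither the functional form nor any coefficients, so the sketch assumes a simple straight-line relationship fitted by ordinary least squares to calibration pairs that the user supplies. The function names percent_identity, fit_line, and predict_genome_identity are hypothetical, and the input sequences are assumed to be already aligned, with "-" marking gaps.

```python
from typing import Sequence, Tuple


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of matching bases over columns where neither sequence has a gap.

    Assumes the two sequences are already aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return matches / compared


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept.

    xs: marker-gene identities for calibration genome pairs.
    ys: whole-genome identities for the same pairs.
    The linear form is an assumption of this sketch, not taken from the paper.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_genome_identity(marker_identity: float, slope: float, intercept: float) -> float:
    """Apply the fitted line to one marker-gene identity value."""
    return slope * marker_identity + intercept
```

In use, fit_line would be calibrated on genome pairs for which both the cpn60 UT identity and the whole-genome identity are known, and predict_genome_identity would then be applied to the cpn60 UT identity of a new isolate pair. Any calibration values have to come from real alignments; none are given here.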

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Chaperonin 60 / genetics*
  • DNA Primers / genetics
  • DNA, Bacterial / genetics*
  • DNA, Bacterial / isolation & purification
  • Genome, Bacterial* / genetics
  • Linear Models
  • Models, Genetic
  • Molecular Sequence Data
  • Phylogeny
  • Polymerase Chain Reaction / methods
  • Sequence Alignment
  • Sequence Analysis, DNA
  • Thermoanaerobacter / classification*
  • Thermoanaerobacter / genetics

Substances

  • Bacterial Proteins
  • Chaperonin 60
  • DNA Primers
  • DNA, Bacterial

Associated data

  • GENBANK/HM623896
  • GENBANK/HM623897
  • GENBANK/HM623898
  • GENBANK/HM623899
  • GENBANK/HM623900
  • GENBANK/HM623901
  • GENBANK/HM623902
  • GENBANK/HM623903
  • GENBANK/HM623904
  • GENBANK/HM623905
  • GENBANK/HM623906
  • GENBANK/HM623907
  • GENBANK/HM623910
  • GENBANK/HQ153761
  • GENBANK/HQ153762
  • GENBANK/HQ153763
  • GENBANK/HQ153764
  • GENBANK/HQ153765
  • GENBANK/HQ153766
  • GENBANK/HQ153767
  • GENBANK/HQ153768
  • GENBANK/HQ153769
  • GENBANK/HQ153770
  • GENBANK/HQ153771
  • GENBANK/HQ153772
  • GENBANK/HQ153773
  • GENBANK/HQ153774
  • GENBANK/HQ153775
  • GENBANK/HQ153776
  • GENBANK/HQ153777
  • GENBANK/HQ153778
  • GENBANK/HQ153779
  • GENBANK/HQ153780
  • GENBANK/HQ153781
  • GENBANK/HQ153782
  • GENBANK/HQ153783
  • GENBANK/HQ153784
  • GENBANK/HQ153785
  • GENBANK/HQ153786
  • GENBANK/HQ153787
  • GENBANK/HQ153788
  • GENBANK/HQ153789
  • GENBANK/HQ153790
  • GENBANK/HQ153791
  • GENBANK/HQ153792
  • GENBANK/HQ153793
  • GENBANK/HQ153794
  • GENBANK/HQ153795
  • GENBANK/HQ153796