Data rotation improves genomotyping efficiency

Biom J. 2005 Aug;47(4):585-98. doi: 10.1002/bimj.200410160.

Abstract

Unsequenced bacterial strains can be characterized by comparing their genomic DNA to a sequenced reference genome of the same species. This comparative genomic approach, also called genomotyping, is leading to an increased understanding of bacterial evolution and pathogenesis. It is efficiently accomplished by comparative genomic hybridization on custom-designed cDNA microarrays. For each gene, the microarray experiment yields fluorescence intensities for the reference genome and for the sample genome. The log-ratio of these intensities is usually compared to a cut-off, classifying each gene of the sample genome as a candidate absent or present gene with respect to the reference genome. Reducing the usually high rate of false positives in the list of candidate absent genes is decisive for both the time and the cost of the experiment. We propose a novel method that improves the efficiency of genomotyping experiments in this respect by rotating the normalized intensity data before setting up the list of candidate genes. We analyze simulated genomotyping data and also re-analyze an experimental data set for comparison and illustration. The proportion of false positives in the list of candidate absent genes is approximately halved, both for the example comparative genomic hybridization experiment and for the simulation experiments.
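The cut-off rule described above, and the idea of rotating the data before applying it, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the rotation angle, the centre of rotation, or the cut-off value, so the rotation about the origin in the (mean log-intensity, log-ratio) plane, the parameter `theta`, the default `cutoff=-1.0`, and the example intensities are all assumptions made for illustration.

```python
import math


def log_ratio(sample, reference):
    """log2 ratio of sample to reference intensity for one gene."""
    return math.log2(sample / reference)


def mean_log_intensity(sample, reference):
    """Average of the two log2 intensities for one gene."""
    return 0.5 * (math.log2(sample) + math.log2(reference))


def rotate(a, m, theta):
    """Rotate the point (a, m) counter-clockwise by theta radians about the origin."""
    return (a * math.cos(theta) - m * math.sin(theta),
            a * math.sin(theta) + m * math.cos(theta))


def candidate_absent(genes, cutoff=-1.0, theta=0.0):
    """Return names of genes whose (rotated) log-ratio lies below the cut-off.

    genes  : iterable of (name, sample_intensity, reference_intensity)
    cutoff : hypothetical threshold on the log2 scale
    theta  : hypothetical rotation angle; 0.0 gives the plain cut-off rule
    """
    candidates = []
    for name, sample, reference in genes:
        a = mean_log_intensity(sample, reference)
        m = log_ratio(sample, reference)
        _, m_rotated = rotate(a, m, theta)
        if m_rotated < cutoff:
            candidates.append(name)
    return candidates


# Made-up, already normalized intensities for three genes.
genes = [("g1", 1000.0, 1000.0), ("g2", 200.0, 1000.0), ("g3", 100.0, 120.0)]
absent = candidate_absent(genes)  # theta=0.0: plain log-ratio cut-off
```

With `theta=0.0` only `g2` (log-ratio about -2.3) falls below the assumed cut-off. A non-zero `theta` changes which genes cross the threshold; how the angle should be chosen is exactly what the paper addresses and is not reproduced here.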

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Genome, Bacterial*
  • Genotype
  • In Situ Hybridization, Fluorescence / methods*
  • Models, Genetic
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Reproducibility of Results
  • Sensitivity and Specificity