The role of internal node sequences and the molecular clock in the analysis of serially-sampled data

Int J Bioinform Res Appl. 2008;4(1):107-21. doi: 10.1504/IJBRA.2008.017167.

Abstract

Algorithms that infer phylogenetic relationships between serially-sampled sequences have been developed in recent years to assist in the analysis of rapidly-evolving human pathogens. We evaluated seven relevant methods using empirical as well as simulated data sets. In particular, we investigated how the molecular clock hypothesis affected their relative performance, as three of the algorithms that accept serially-sampled data as input assume a molecular clock. Our results show that the standard phylogenetic methods and MinPD performed better overall. Surprisingly, when all internal node sequences were included in the data, the topological performance of every method except MinPD dropped significantly.
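
As an illustration of what a molecular clock assumption implies for serially-sampled data, the sketch below fits a straight line to root-to-tip distance against sampling time. Under a strict clock, divergence should grow roughly linearly with time, so the slope estimates the substitution rate. This is not one of the seven methods evaluated in the study; the function name `fit_clock`, the sampling times, and the distances are hypothetical values chosen only for the example.

```python
def fit_clock(times, distances):
    """Least-squares fit of distance = rate * time + intercept.

    Returns (rate, intercept, r_squared). A high r_squared suggests the
    data are compatible with a strict molecular clock; it does not prove it.
    """
    n = len(times)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (time, distance) pairs")
    mean_t = sum(times) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("sampling times must not all be equal")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, distances))
    rate = sxy / sxx
    intercept = mean_d - rate * mean_t
    ss_res = sum((d - (rate * t + intercept)) ** 2
                 for t, d in zip(times, distances))
    ss_tot = sum((d - mean_d) ** 2 for d in distances)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return rate, intercept, r_squared


# Hypothetical data: sampling times in years, distances in substitutions/site.
sample_times = [0.0, 1.0, 2.0, 3.0, 4.0]
clocklike = [0.010 + 0.005 * t for t in sample_times]

rate, intercept, r_squared = fit_clock(sample_times, clocklike)
print(f"rate={rate:.4f} intercept={intercept:.4f} r2={r_squared:.4f}")
```

A poor fit on real data would suggest that the clock-based methods are being applied outside their assumptions, which is the kind of effect on relative performance the study set out to examine.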

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Animals
  • Base Sequence
  • Biological Clocks / genetics*
  • Computer Simulation
  • DNA Mutational Analysis / methods*
  • DNA, Viral / genetics*
  • Evolution, Molecular*
  • Genetics, Population*
  • Humans
  • Models, Genetic*
  • Molecular Sequence Data
  • Phylogeny
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Viral