Phylogenetic inference with weighted codon evolutionary distances

J Mol Evol. 2009 Apr;68(4):377-92. doi: 10.1007/s00239-009-9212-y. Epub 2009 Mar 24.

Abstract

We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment using a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. These three distance matrices are then weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a single distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices with significantly better treelikeness than those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate part of the noise often associated with some codon positions, particularly the third position, which is known to evolve at a fast rate. Simulation results show that fast distance-based tree reconstruction algorithms applied to distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not more accurate than, those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed; the results show that our codon evolutionary distances allow building a phylogenetic tree that is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and significantly improved compared to the tree obtained from standard nucleotide evolutionary distance estimates.
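
The three-step procedure described in the abstract (one distance matrix per codon position, a per-position rate estimate, a weighted average) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the distance model, the rate estimator, or the weighting formula. The sketch therefore assumes the uncorrected p-distance as the "standard" distance, uses the mean pairwise distance as a crude stand-in for the global rate of a position, and leaves the weights to the caller. All function names are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: the uncorrected p-distance stands in for the
    'standard distance' mentioned in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def position_matrices(seqs):
    """Return three pairwise distance matrices, one per codon position."""
    n = len(seqs)
    matrices = []
    for pos in range(3):
        # Keep only the sites at this codon position (1st, 2nd or 3rd).
        cols = [s[pos::3] for s in seqs]
        d = [[0.0] * n for _ in range(n)]
        for i, j in combinations(range(n), 2):
            d[i][j] = d[j][i] = p_distance(cols[i], cols[j])
        matrices.append(d)
    return matrices


def mean_pairwise_distance(d):
    """Crude rate proxy (an assumption): mean off-diagonal distance."""
    pairs = list(combinations(range(len(d)), 2))
    return sum(d[i][j] for i, j in pairs) / len(pairs)


def combine(matrices, weights):
    """Weighted average of the per-position matrices.

    The weights are supplied by the caller; the paper's actual
    weighting scheme is not reproduced here.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) / total
         for j in range(n)]
        for i in range(n)
    ]


# Toy codon alignment (three sequences, three codons each).
seqs = ["ATGGCCAAA", "ATGGCTAAG", "ATGGCGAGA"]
mats = position_matrices(seqs)
rates = [mean_pairwise_distance(m) for m in mats]

# Equal weights, and a hypothetical weighting that drops the third position.
equal = combine(mats, [1.0, 1.0, 1.0])
no_third = combine(mats, [1.0, 1.0, 0.0])
```

In this toy alignment the third codon position has the largest mean pairwise distance, which is the kind of signal a rate-based weighting would act on. Setting its weight to zero is only an extreme illustration of down-weighting a noisy position, not the alternative weighting proposed in the paper.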

MeSH terms

  • Algorithms
  • Candida / genetics
  • Codon / genetics*
  • Computer Simulation
  • Evolution, Molecular*
  • Models, Genetic*
  • Models, Statistical*
  • Phylogeny*
  • Saccharomyces / genetics
  • Sequence Alignment

Substances

  • Codon