The impact of HGT on phylogenomic reconstruction methods

Brief Bioinform. 2014 Jan;15(1):79-90. doi: 10.1093/bib/bbs050. Epub 2012 Aug 20.

Abstract

Supermatrix and supertree analyses are frequently used to recover vertical evolutionary history more accurately, but debate still exists over which method provides greater reliability. Traditional methods that resolve relationships among organisms from single genes are often unreliable because strong phylogenetic signal is frequently lacking and systematic artifacts are present. Methods developed to reconstruct organismal history from multiple genes can be divided into supermatrix and supertree approaches. A supermatrix analysis consists of the concatenation of multiple genes into a single, possibly partitioned alignment, from which phylogenies are reconstructed using a variety of approaches. Supertrees build consensus trees from the topological information contained within individual gene trees. Both methods are now widely used and have been demonstrated to resolve previously ambiguous or unresolved phylogenies with high statistical support. However, the amount of misleading signal needed to induce erroneous phylogenies under either strategy is still unknown. Using genome simulations, we test the accuracy of supertree and supermatrix approaches in recovering the true organismal phylogeny under increasing amounts of horizontally transferred genes and changes in substitution rates. Our results show that, overall, supermatrix approaches are preferable when a low amount of gene transfer is suspected to be present in the dataset, while supertrees have greater reliability in the presence of a moderate amount of misleading gene transfers. In the face of very high or very low substitution rates without horizontal gene transfers, supermatrix approaches outperform supertrees, because individual gene trees remain unresolved while additional sequences contribute to a congruent phylogenetic signal.
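
The concatenation step behind a supermatrix analysis can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name `concatenate`, the dictionary-based alignment representation, the gap-filling of taxa absent from a gene, and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one alignment maps taxon name -> aligned sequence.
Alignment = Dict[str, str]
Partition = Tuple[str, int, int]  # (gene name, first column, last column), 1-based inclusive


def concatenate(genes: Dict[str, Alignment], missing: str = "-") -> Tuple[Alignment, List[Partition]]:
    """Join per-gene alignments into one supermatrix and record partition boundaries.

    Taxa that lack a gene are padded with the `missing` character (an assumption;
    real pipelines may use '?' or 'N' instead).
    """
    # Union of all taxa seen in any gene, in a stable order.
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Partition] = []
    start = 1

    for name, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {name!r} is empty or not a proper alignment")
        length = lengths.pop()
        for taxon in taxa:
            # Use the sequence if present, otherwise a block of missing data.
            rows[taxon].append(aln.get(taxon, missing * length))
        partitions.append((name, start, start + length - 1))
        start += length

    supermatrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return supermatrix, partitions


# Toy input (invented sequences): sp2 lacks geneB.
example_genes = {
    "geneA": {"sp1": "ACGT", "sp2": "ACGA", "sp3": "ACTT"},
    "geneB": {"sp1": "GGC", "sp3": "GAC"},
}
matrix, parts = concatenate(example_genes)
for taxon, seq in matrix.items():
    print(taxon, seq)
print(parts)
```

The returned partition list keeps each gene's column range, which is what allows the concatenated alignment to be analysed as a partitioned dataset rather than under a single model. A supertree analysis would instead infer one tree per entry of `example_genes` and combine the resulting topologies.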

Keywords: concatenation; horizontal gene transfer; phylogenomics; quartets; supermatrix; supertree.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Computer Simulation
  • Evolution, Molecular
  • Gene Transfer, Horizontal*
  • Genomics / statistics & numerical data
  • Models, Genetic*
  • Phylogeny*
  • Sequence Alignment / statistics & numerical data