Renewing Felsenstein's phylogenetic bootstrap in the era of big data

Nature. 2018 Apr;556(7702):452-456. doi: 10.1038/s41586-018-0043-0. Epub 2018 Apr 18.

Abstract

Felsenstein's application of the bootstrap method to evolutionary trees is one of the most cited scientific papers of all time. The bootstrap method, which is based on resampling and replications, is used extensively to assess the robustness of phylogenetic inferences. However, increasing numbers of sequences are now available for a wide variety of species, and phylogenies based on hundreds or thousands of taxa are becoming routine. With phylogenies of this size, Felsenstein's bootstrap tends to yield very low supports, especially on deep branches. Here we propose a new version of the phylogenetic bootstrap in which the presence of inferred branches in replications is measured using a gradual 'transfer' distance rather than the binary presence or absence index used in Felsenstein's original version. The resulting supports are higher and do not induce falsely supported branches. The application of our method to large mammal, HIV and simulated datasets reveals their phylogenetic signals, whereas Felsenstein's bootstrap fails to do so.
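
The contrast between the binary presence or absence index and a gradual transfer distance can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract gives no formulas, so the following are assumptions made for illustration: each tree is represented as a list of bipartitions (sets of taxon names on one side of a branch), the transfer distance is taken as the number of taxa that must change sides to make two bipartitions identical, and the support is normalised by p - 1, where p is the size of the smaller side of the branch. All function and variable names are hypothetical.

```python
from typing import FrozenSet, List, Set

Split = FrozenSet[str]  # taxa on one side of a branch (assumed representation)


def transfer_distance(a: Split, b: Split, taxa: Set[str]) -> int:
    """Number of taxa that must change sides to turn split a into split b."""
    diff = len(a ^ b)  # taxa placed differently by the two splits
    # A split and its complement describe the same branch, so take the smaller count.
    return min(diff, len(taxa) - diff)


def binary_support(branch: Split, replicates: List[List[Split]], taxa: Set[str]) -> float:
    """Felsenstein-style support: fraction of replicates containing the exact branch."""
    hits = sum(
        any(transfer_distance(branch, s, taxa) == 0 for s in rep)
        for rep in replicates
    )
    return hits / len(replicates)


def transfer_support(branch: Split, replicates: List[List[Split]], taxa: Set[str]) -> float:
    """Gradual support: 1 - (mean closest transfer distance) / (p - 1) (assumed form)."""
    p = min(len(branch), len(taxa) - len(branch))
    if p < 2:
        raise ValueError("support is only defined here for internal branches (p >= 2)")
    total = 0
    for rep in replicates:
        closest = min((transfer_distance(branch, s, taxa) for s in rep), default=p - 1)
        # Cap at p - 1: a single-taxon (leaf) split is always within that distance,
        # and leaf splits are not listed explicitly in this sketch.
        total += min(closest, p - 1)
    return 1 - (total / len(replicates)) / (p - 1)


# Toy data (invented for illustration): 8 taxa, one reference branch, 2 replicates.
TAXA = set("abcdefgh")
BRANCH = frozenset("abcd")
REPLICATES = [
    [frozenset("abcd"), frozenset("ef")],  # contains the exact branch
    [frozenset("abc"), frozenset("ef")],   # closest branch is missing one taxon
]

fbp = binary_support(BRANCH, REPLICATES, TAXA)
tbe = transfer_support(BRANCH, REPLICATES, TAXA)
```

On this toy input the binary index gives 0.5, because the second replicate counts as a complete miss, while the transfer-based support is about 0.83, because that replicate contains a branch only one taxon away from the reference branch. This mirrors the abstract's point that a gradual measure yields higher supports than an all-or-nothing one when replicates recover a branch almost, but not exactly.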

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Computer Simulation
  • DNA Barcoding, Taxonomic
  • Data Interpretation, Statistical*
  • Datasets as Topic*
  • HIV-1 / genetics*
  • Haplorhini / genetics
  • Mammals / genetics*
  • Phylogeny*
  • pol Gene Products, Human Immunodeficiency Virus / chemistry
  • pol Gene Products, Human Immunodeficiency Virus / genetics

Substances

  • pol Gene Products, Human Immunodeficiency Virus