Robustness of Phylogenetic Inference to Model Misspecification Caused by Pairwise Epistasis

Mol Biol Evol. 2021 Sep 27;38(10):4603-4615. doi: 10.1093/molbev/msab163.

Abstract

Likelihood-based phylogenetic inference posits a probabilistic model of character state change along branches of a phylogenetic tree. These models typically assume statistical independence of sites in the sequence alignment. This is a restrictive assumption that facilitates computational tractability, but ignores how epistasis, the effect of genetic background on mutational effects, influences the evolution of functional sequences. We consider the effect of using a misspecified site-independent model on the accuracy of Bayesian phylogenetic inference in the setting of pairwise-site epistasis. Previous work has shown that as alignment length increases, tree reconstruction accuracy also increases. Here, we present a simulation study demonstrating that accuracy increases with alignment size even if the additional sites are epistatically coupled. We introduce an alignment-based test statistic that is a diagnostic for pairwise epistasis and can be used in posterior predictive checks.
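
The posterior predictive check mentioned above can be illustrated with a minimal sketch. The abstract does not define the test statistic, so the code below substitutes a stand-in that is an assumption and not the authors' statistic: the maximum mutual information over all pairs of alignment columns. The function names (column_mutual_information, max_pairwise_mi, posterior_predictive_p_value) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


def max_pairwise_mi(alignment):
    """Stand-in statistic (an assumption, not the paper's): the largest
    mutual information over all pairs of columns in the alignment."""
    columns = list(zip(*alignment))  # transpose sequences (rows) into sites (columns)
    return max(
        column_mutual_information(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    )


def posterior_predictive_p_value(observed_stat, simulated_stats):
    """Fraction of simulated statistics at least as large as the observed one."""
    exceed = sum(1 for s in simulated_stats if s >= observed_stat)
    return exceed / len(simulated_stats)


# Toy alignment: sites 0 and 1 covary perfectly, site 2 is independent of both.
observed = max_pairwise_mi(["AAC", "AAG", "CCC", "CCG"])
```

In a real analysis, the simulated statistics would be computed on alignments generated under the site-independent model with parameters drawn from the posterior. A small p-value would then suggest more pairwise association between sites than that model predicts.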

Keywords: epistasis; model adequacy; phylogenetics; posterior predictive simulation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Epistasis, Genetic
  • Evolution, Molecular*
  • Likelihood Functions
  • Models, Genetic*
  • Phylogeny