Inferring strain-level mutational drivers of phage-bacteria interaction phenotypes

bioRxiv [Preprint]. 2024 Jan 9:2024.01.08.574707. doi: 10.1101/2024.01.08.574707.

Abstract

The enormous diversity of bacteriophages and their bacterial hosts makes it challenging to predict which phages infect a focal set of bacteria. Infection is largely determined by the complementary, and largely uncharacterized, genetics of adsorption, injection, and cell take-over. Here we present a machine learning (ML) approach to predict phage-bacteria interactions, trained on the genome sequences of, and the phenotypic interactions among, 51 Escherichia coli strains and 45 phage λ strains that coevolved in laboratory conditions for 37 days. Leveraging multiple inference strategies and without a priori knowledge of driver mutations, this framework predicts both who infects whom and the quantitative levels of infection across a suite of 2,295 potential interactions. The most effective ML approach inferred interaction phenotypes from independent contributions of phage and bacterial mutations, predicting phage host range with 86% mean classification accuracy while reducing the relative error in the estimated strength of the infection phenotype by 40%. Further, transparent feature selection in the predictive model revealed 18 of 176 phage λ mutations and 6 of 18 E. coli mutations that have a significant influence on the outcome of phage-bacteria interactions, corroborating sites previously known to affect phage λ infections and identifying mutations in genes of unknown function not previously shown to influence bacterial resistance. While the genetic variation studied was limited to a focal, coevolved phage-bacteria system, the method's success at recapitulating strain-level infection outcomes provides a path toward developing strategies for inferring interactions in non-model systems, including those of therapeutic significance.
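
To make the idea of "independent contributions from phage and bacteria mutations" concrete, the following is a minimal sketch of an additive model over mutation presence/absence features. It is not the authors' implementation: the data are synthetic, ridge regression is a stand-in chosen for brevity, and the names `additive_design`, `fit_ridge`, and `true_w` are hypothetical. Only the strain and mutation counts (51 hosts, 45 phages, 18 and 176 mutations, 2,295 pairs) come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Counts taken from the abstract; everything else below is synthetic.
N_HOSTS, N_PHAGES = 51, 45
N_HOST_MUT, N_PHAGE_MUT = 18, 176

# Synthetic presence/absence mutation profiles (NOT the study's data).
host_mut = rng.integers(0, 2, size=(N_HOSTS, N_HOST_MUT)).astype(float)
phage_mut = rng.integers(0, 2, size=(N_PHAGES, N_PHAGE_MUT)).astype(float)


def additive_design(host_mut, phage_mut):
    """Build one row per (host, phage) pair: [1, host mutations, phage mutations].

    No host-by-phage product terms are included, so each mutation
    contributes to the predicted phenotype independently.
    """
    n_h, n_p = len(host_mut), len(phage_mut)
    h = np.repeat(host_mut, n_p, axis=0)  # each host row repeated for every phage
    p = np.tile(phage_mut, (n_h, 1))      # full phage block repeated for every host
    return np.hstack([np.ones((n_h * n_p, 1)), h, p])


def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression (assumed stand-in for the paper's model)."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


X = additive_design(host_mut, phage_mut)  # 51 * 45 = 2,295 rows

# Hypothetical ground truth: two host and three phage "driver" mutations
# plus a small offset and measurement noise.
true_w = np.zeros(X.shape[1])
true_w[0] = 0.25
true_w[[1, 2, 20, 21, 22]] = [-2.0, -1.5, 1.0, 1.5, 2.0]
y = X @ true_w + rng.normal(scale=0.1, size=X.shape[0])

# Hold out 20% of the pairs to check predictions on unseen interactions.
idx = rng.permutation(X.shape[0])
n_test = X.shape[0] // 5
test, train = idx[:n_test], idx[n_test:]

w = fit_ridge(X[train], y[train])
pred = X[test] @ w

# Quantitative error, and "who infects whom" by thresholding at zero.
rmse = float(np.sqrt(np.mean((pred - y[test]) ** 2)))
accuracy = float(np.mean((pred > 0) == (y[test] > 0)))
print(f"pairs={X.shape[0]} rmse={rmse:.3f} accuracy={accuracy:.3f}")
```

Because there are more phage mutations (176) than phage strains (45), the individual weights in such a model are not uniquely determined, so this sketch only illustrates prediction. Identifying which mutations matter, as the abstract describes, would need a separate feature-selection step that is not shown here.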

Publication types

  • Preprint