Latent environment allocation of microbial community data

PLoS Comput Biol. 2018 Jun 6;14(6):e1006143. doi: 10.1371/journal.pcbi.1006143. eCollection 2018 Jun.

Abstract

As data on microbial community structures found in various environments have accumulated, studies have examined the relationship between the environmental labels assigned to retrieved microbial samples and their community structures. However, because environments change continuously over time and space, mixed states of environments and their effects on community formation should be considered, rather than only the effects of discrete environmental categories. Here we applied a hierarchical Bayesian model to paired datasets containing more than 30,000 samples of microbial community structures and sample description documents. From the training results, we extracted latent environmental topics that associate co-occurring microbes with co-occurring word sets among samples. Topics are the core elements of environmental mixtures, and visualizing samples by their topics clarifies the connections among various environments. Based on the model training results, we developed a web application, LEA (Latent Environment Allocation), which provides a way to evaluate the typicality and heterogeneity of microbial communities in newly obtained samples without restricting in advance the environmental categories to be compared. Because topics link words and microbes, LEA also enables searching the 30,000 microbiome samples for those semantically related to a query.
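
The idea of describing a new sample as a mixture of already-learned topics can be illustrated with a small sketch. This is not the authors' hierarchical Bayesian model or the LEA implementation: it assumes the topic-to-taxon probabilities are already fixed and estimates only the sample's topic proportions by a plain maximum-likelihood EM "fold-in". The function name `infer_topic_mixture`, the two topics, and the taxon counts are all hypothetical values chosen for illustration.

```python
def infer_topic_mixture(counts, topic_taxon_probs, n_iter=200):
    """Estimate topic proportions for one sample, holding topics fixed.

    counts            -- observed read counts per taxon for the new sample
    topic_taxon_probs -- one probability vector over taxa per topic
                         (assumed to come from an already-trained model)
    """
    n_topics = len(topic_taxon_probs)
    theta = [1.0 / n_topics] * n_topics  # start from a uniform mixture
    for _ in range(n_iter):
        new_theta = [0.0] * n_topics
        for taxon, count in enumerate(counts):
            if count == 0:
                continue
            # E-step: share of this taxon's reads attributed to each topic
            weights = [theta[k] * topic_taxon_probs[k][taxon]
                       for k in range(n_topics)]
            norm = sum(weights)
            if norm == 0.0:
                continue  # taxon not explained by any topic; skip it
            for k in range(n_topics):
                new_theta[k] += count * weights[k] / norm
        total = sum(new_theta)
        if total == 0.0:
            return theta  # nothing to update from
        # M-step: renormalise attributed reads into proportions
        theta = [value / total for value in new_theta]
    return theta


# Hypothetical topics over three taxa (illustrative numbers only).
example_topics = [
    [0.8, 0.2, 0.0],  # hypothetical topic A
    [0.0, 0.2, 0.8],  # hypothetical topic B
]
example_counts = [40, 20, 40]  # hypothetical sample with reads from both
mixture = infer_topic_mixture(example_counts, example_topics)
print(mixture)
```

A sample whose reads come equally from taxa characteristic of both hypothetical topics receives a balanced mixture, while a sample dominated by one topic's taxa receives a mixture concentrated on that topic.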

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Environmental Microbiology
  • Humans
  • Metagenomics
  • Microbiota* / genetics
  • Microbiota* / physiology
  • Models, Statistical
  • Rivers / microbiology

Grants and funding

This work was supported by The Japan Society for the Promotion of Science KAKENHI Grant Number 16H06279 and the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.