Metabolic network-guided binning of metagenomic sequence fragments

Bioinformatics. 2016 Mar 15;32(6):867-74. doi: 10.1093/bioinformatics/btv671. Epub 2015 Nov 14.

Abstract

Motivation: Most microbes on Earth have never been grown in a laboratory and can only be studied through their DNA sequences. Environmental DNA sequence samples are complex mixtures of fragments from many different, often unknown, species. There is a pressing need for methods that can reliably reconstruct genomes from complex metagenomic samples in order to address questions in ecology, bioremediation, and human health.

Results: We present the SOrting by NEtwork Completion (SONEC) approach for assigning reactions to incomplete metabolic networks based on a metabolite connectivity score. We successfully demonstrate proof of concept in a set of 100 genome-scale metabolic network reconstructions, and delineate the variables that impact reaction assignment accuracy. We further demonstrate the integration of SONEC with existing approaches (such as cross-sample scaffold abundance profile clustering) on a set of 94 metagenomic samples from the Human Microbiome Project. We show that not only does SONEC aid in reconstructing species-level genomes, but it also improves functional predictions made with the resulting metabolic networks.
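To make the idea of assigning a reaction by metabolite connectivity concrete, here is a minimal Python sketch. The abstract does not define how SONEC computes its score, so everything in it is an illustrative assumption: the score is taken to be the fraction of a reaction's metabolites already present in a candidate network, and the names `connectivity_score`, `assign_reaction`, `bin_A`, and `bin_B`, the toy metabolite identifiers, and the tie-handling rule are all hypothetical.

```python
from typing import Dict, Optional, Set


def connectivity_score(reaction_mets: Set[str], network_mets: Set[str]) -> float:
    """Fraction of the reaction's metabolites already present in the network.

    ASSUMPTION: this is an illustrative stand-in, not the published SONEC score.
    """
    if not reaction_mets:
        return 0.0
    return len(reaction_mets & network_mets) / len(reaction_mets)


def assign_reaction(
    reaction_mets: Set[str], networks: Dict[str, Set[str]]
) -> Optional[str]:
    """Return the name of the single best-connected network, or None.

    ASSUMPTION: a reaction is left unassigned when no network shares a
    metabolite with it, or when the top score is tied between networks.
    """
    scores = {
        name: connectivity_score(reaction_mets, mets)
        for name, mets in networks.items()
    }
    best = max(scores.values(), default=0.0)
    winners = [name for name, score in scores.items() if score == best]
    if best == 0.0 or len(winners) != 1:
        return None
    return winners[0]


# Hypothetical incomplete networks, each reduced to its set of metabolites.
networks = {
    "bin_A": {"glc", "g6p", "atp", "adp"},
    "bin_B": {"pyr", "lac", "nadh", "nad"},
}

# A reaction that shares "g6p" with bin_A and nothing with bin_B.
unassigned_reaction = {"g6p", "f6p"}
print(assign_reaction(unassigned_reaction, networks))
```

In this toy case the reaction connects only to `bin_A`, so it is assigned there. A reaction that connects equally well to both bins is left unassigned, which is the kind of ambiguity that complementary signals such as abundance profile clustering could help resolve.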

Availability and implementation: The datasets and code presented in this work are available at: https://bitbucket.org/mattbiggs/sorting_by_network_completion/

Contact: papin@virginia.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Humans
  • Metabolic Networks and Pathways
  • Metagenomics*
  • Microbiota