Inferring expressed genes by whole-genome sequencing of plasma DNA

Nat Genet. 2016 Oct;48(10):1273-8. doi: 10.1038/ng.3648. Epub 2016 Aug 29.

Abstract

The analysis of cell-free DNA (cfDNA) in plasma represents a rapidly advancing field in medicine. cfDNA consists predominantly of nucleosome-protected DNA shed into the bloodstream by cells undergoing apoptosis. We performed whole-genome sequencing of plasma DNA and identified two discrete regions at transcription start sites (TSSs) where nucleosome occupancy results in different read depth coverage patterns for expressed and silent genes. By employing machine learning for gene classification, we found that the plasma DNA read depth patterns from healthy donors reflected the expression signature of hematopoietic cells. In patients with metastatic cancer, we were able to classify expressed cancer driver genes in regions with somatic copy number gains with high accuracy. We were also able to determine the expressed isoform of genes with several TSSs, as confirmed by RNA-seq analysis of the matching primary tumor. Our analyses provide functional information about cells releasing their DNA into the circulation.
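
The idea of reading expression status from coverage around a TSS can be illustrated with a minimal sketch. It is not the authors' pipeline. The window sizes, the cutoffs, the toy coverage profiles, and every function name below are hypothetical, and a simple threshold rule stands in for the machine learning classifier mentioned in the abstract. The sketch assumes that expressed genes show reduced read depth near the TSS because nucleosomes are depleted there.

```python
from statistics import mean


def window_mean(coverage, tss_index, upstream, downstream):
    """Mean read depth in [tss_index - upstream, tss_index + downstream)."""
    start = max(0, tss_index - upstream)
    end = min(len(coverage), tss_index + downstream)
    return mean(coverage[start:end])


def tss_features(coverage, tss_index, wide=(1000, 1000), narrow=(150, 50)):
    """Return (wide, narrow) window coverage, each divided by the profile mean.

    Both window sizes are illustrative assumptions, not values from the abstract.
    """
    baseline = mean(coverage)
    wide_cov = window_mean(coverage, tss_index, *wide) / baseline
    narrow_cov = window_mean(coverage, tss_index, *narrow) / baseline
    return wide_cov, narrow_cov


def classify(features, wide_cutoff=0.9, narrow_cutoff=0.7):
    """Threshold rule (assumed cutoffs): low coverage in both windows -> expressed."""
    wide_cov, narrow_cov = features
    if wide_cov < wide_cutoff and narrow_cov < narrow_cutoff:
        return "expressed"
    return "silent"


# Toy per-base coverage profiles (made up for illustration), TSS at index 2000.
silent_profile = [30.0] * 4000

expressed_profile = list(silent_profile)
for i in range(1000, 3000):      # broad, mild coverage reduction around the TSS
    expressed_profile[i] = 24.0
for i in range(1850, 2050):      # sharp dip directly at the TSS
    expressed_profile[i] = 10.0

print(classify(tss_features(silent_profile, 2000)))
print(classify(tss_features(expressed_profile, 2000)))
```

In practice the two features would be computed for every annotated TSS from a real depth track and passed to a trained classifier. The fixed cutoffs here only make the decision rule easy to follow.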

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / blood*
  • Female
  • Gene Expression*
  • Genome, Human*
  • Humans
  • Male
  • Neoplasms / blood
  • Neoplasms / genetics
  • Nucleosomes / genetics
  • RNA / blood
  • Sequence Analysis, DNA
  • Transcription Initiation Site

Substances

  • Nucleosomes
  • RNA
  • DNA