ERASE-Seq: Leveraging replicate measurements to enhance ultralow frequency variant detection in NGS data

PLoS One. 2018 Apr 9;13(4):e0195272. doi: 10.1371/journal.pone.0195272. eCollection 2018.

Abstract

The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies, where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise, but significant sensitivity and specificity improvements are still needed to detect mutations in a majority of patients before the metastatic stage. To address this, we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework that uses technical replicates in conjunction with background error modeling, providing a 10- to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents.
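
The general idea of combining technical replicates with a background error model can be illustrated with a minimal sketch. This is not the published ERASE-Seq algorithm: the binomial background model, the rule that every replicate must pass, the threshold `alpha`, and the function names `binom_sf` and `call_variant` are all assumptions made for illustration only.

```python
from math import exp, lgamma, log


def log_binom_pmf(i, n, p):
    """Log of the Binomial(n, p) probability mass at i (requires 0 < p < 1)."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), via the complement of the lower tail."""
    if k <= 0:
        return 1.0
    cdf = sum(exp(log_binom_pmf(i, n, p)) for i in range(k))
    return max(0.0, 1.0 - cdf)


def call_variant(replicates, background_rate, alpha=1e-3):
    """Hypothetical replicate-aware caller (illustrative assumption only).

    replicates: list of (alt_reads, depth) pairs, one per technical replicate.
    background_rate: assumed per-read error rate at this position.
    A variant is called only if every replicate shows more alt reads than
    the background model plausibly explains (p-value <= alpha).
    """
    p_values = [binom_sf(alt, depth, background_rate)
                for alt, depth in replicates]
    return all(p <= alpha for p in p_values), p_values


# A signal present in all three replicates versus a stochastic error
# that appears strongly in only one replicate.
consistent = [(10, 10000), (12, 10000), (9, 10000)]
one_off = [(10, 10000), (1, 10000), (0, 10000)]
print(call_variant(consistent, background_rate=1e-4)[0])
print(call_variant(one_off, background_rate=1e-4)[0])
```

Under these assumptions, a position-specific background rate addresses recurrent artifacts, while requiring agreement across replicates suppresses stochastic errors that arise in a single library.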

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cell Line
  • Computational Biology
  • DNA Barcoding, Taxonomic / statistics & numerical data
  • Gene Frequency
  • Gene Library
  • Genetic Variation
  • High-Throughput Nucleotide Sequencing / statistics & numerical data*
  • Humans
  • INDEL Mutation
  • Sequence Analysis, DNA / statistics & numerical data*

Grants and funding

The studies were funded by Fluxion Biosciences, Swift Biosciences and Illumina. No individual authors received specific funding for this work. The affiliated companies provided support in the form of salaries for authors NKH, CIZ (Fluxion Biosciences), AM, LH, TH (Swift Biosciences), PP and CR (Illumina). The respective companies did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.