Accurate detection of subclonal single nucleotide variants in whole genome amplified and pooled cancer samples using HaloPlex target enrichment

BMC Genomics. 2013 Dec 5;14(1):856. doi: 10.1186/1471-2164-14-856.

Abstract

Background: Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing.

Results: We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792-1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools.
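The arithmetic behind detection in pools can be sketched as follows. This is a minimal illustration, not the authors' variant-calling method. It assumes that each sample contributes equally to the pool, that the allele fraction of 0.1 refers to the individual sample (so the variant is diluted to 0.01 in a pool of ten), that read counts follow a simple binomial model with no sequencing error, and that a variant is called when at least `min_alt_reads` supporting reads are seen. The function names and the threshold of three reads are hypothetical.

```python
from math import comb


def pooled_allele_fraction(sample_af: float, pool_size: int) -> float:
    """Expected allele fraction in the pool for a variant unique to one sample.

    Assumes every sample contributes the same amount of DNA to the pool.
    """
    return sample_af / pool_size


def detection_probability(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under a Binomial(depth, allele_fraction) model.

    Sequencing errors and capture bias are ignored in this sketch.
    """
    p_below = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    pool_af = pooled_allele_fraction(sample_af=0.1, pool_size=10)
    # Evaluate at the lowest and highest depths reported in the abstract.
    for depth in (792, 1752):
        p = detection_probability(depth, pool_af, min_alt_reads=3)
        print(f"depth={depth} pooled_af={pool_af:.3f} P(detect)={p:.4f}")
```

Under these assumptions, the expected number of variant reads is the depth multiplied by the pooled allele fraction, which is why deep sequencing is needed to recover variants diluted by pooling.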

Conclusion: Our work demonstrates that whole genome amplified DNA performs as well as genomic DNA in target enrichment and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, which significantly increases the chances of identifying recurrent mutations in cancer samples.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Child
  • Child, Preschool
  • Gene Frequency
  • Genome, Human*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Germ Cells / metabolism
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Neoplasms / genetics*
  • Polymorphism, Single Nucleotide*
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / genetics
  • Reproducibility of Results
  • Sensitivity and Specificity