A single cell level based method for copy number variation analysis by low coverage massively parallel sequencing

PLoS One. 2013;8(1):e54236. doi: 10.1371/journal.pone.0054236. Epub 2013 Jan 23.

Abstract

Copy number variations (CNVs), a common class of genomic mutation associated with various diseases, are important in research and clinical applications. Whole genome amplification (WGA) and massively parallel sequencing have been applied to single-cell CNV analysis, providing new insight for biology and medicine. However, WGA-induced bias significantly limits the sensitivity and specificity of CNV detection. To address these limitations, we developed a practical bioinformatic methodology for CNV detection at the single-cell level using low-coverage massively parallel sequencing. The method consists of GC correction to remove WGA-induced bias, a binary segmentation algorithm to locate CNV breakpoints, and dynamic threshold determination for final signal filtering. We evaluated the method on seven test samples using low-coverage sequencing (4% to 9.5%). Four were single-cell samples from peripheral blood whose karyotypes had been confirmed by whole genome sequencing analysis; the other three were derived from blastocysts whose karyotypes had been confirmed by SNP-array analysis. For CNVs larger than 1 Mb, the detection results were highly consistent with the confirmed results, reaching 99.63% sensitivity and 97.71% specificity at the base-pair level. Our study demonstrates the potential to overcome WGA bias and to detect CNVs (>1 Mb) at the single-cell level through low-coverage massively parallel sequencing. It highlights the potential for CNV research on single cells or limited DNA samples, and the method may prove a promising tool for research and clinical applications such as pre-implantation genetic diagnosis/screening, fetal nucleated red blood cell research, and cancer heterogeneity analysis.
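
To make the first two steps of the pipeline concrete, the following is a minimal Python sketch of GC correction followed by binary segmentation on binned read counts. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the bin size, the form of the GC correction, or the segmentation statistic. The function names (`gc_correct`, `copy_ratio`, `binary_segmentation`) and the parameters `n_strata`, `min_size`, and `min_gain` are hypothetical. The fixed `min_gain` cutoff is only a stand-in; the dynamic threshold determination described above is not reproduced here.

```python
import numpy as np


def gc_correct(counts, gc, n_strata=20):
    """Rescale per-bin read counts so every GC stratum has the same median.

    Assumption: a simple median-per-stratum correction; the paper's exact
    GC correction may differ.
    """
    counts = np.asarray(counts, dtype=float)
    gc = np.asarray(gc, dtype=float)
    # Assign each bin to a GC stratum (gc is a fraction in [0, 1]).
    strata = np.minimum((gc * n_strata).astype(int), n_strata - 1)
    global_median = np.median(counts)
    corrected = counts.copy()
    for s in np.unique(strata):
        mask = strata == s
        stratum_median = np.median(counts[mask])
        if stratum_median > 0:
            corrected[mask] = counts[mask] * global_median / stratum_median
    return corrected


def copy_ratio(corrected):
    """Normalize corrected counts so the typical (median) bin has ratio 1."""
    corrected = np.asarray(corrected, dtype=float)
    return corrected / np.median(corrected)


def binary_segmentation(x, min_size=5, min_gain=1.0, offset=0):
    """Recursively split x at the point that most reduces squared error.

    Returns a sorted list of breakpoint indices (start of each new segment).
    min_size and min_gain are hypothetical tuning parameters.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if n < 2 * min_size:
        return []
    csum = np.cumsum(x)
    total = csum[-1]
    k = np.arange(min_size, n - min_size + 1)  # candidate left-segment sizes
    left_mean = csum[k - 1] / k
    right_mean = (total - csum[k - 1]) / (n - k)
    # Reduction in the sum of squared errors obtained by splitting at k.
    gain = k * (n - k) / n * (left_mean - right_mean) ** 2
    i = int(np.argmax(gain))
    if gain[i] < min_gain:
        return []
    split = int(k[i])
    left = binary_segmentation(x[:split], min_size, min_gain, offset)
    right = binary_segmentation(x[split:], min_size, min_gain, offset + split)
    return left + [offset + split] + right
```

On a noise-free toy profile of 50 bins with a 1.5-fold gain over bins 20 to 29, `binary_segmentation(ratio, min_size=5, min_gain=0.1)` returns the breakpoints `[20, 30]`. Real single-cell data are far noisier, so the cutoff would need to be set from the data rather than fixed in advance.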

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Base Composition
  • Blastocyst / cytology
  • Blastocyst / metabolism
  • Chromosome Mapping
  • DNA Copy Number Variations*
  • Gene Dosage
  • Genome, Human*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Karyotyping
  • Leukocytes, Mononuclear / cytology
  • Leukocytes, Mononuclear / metabolism
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Sequence Analysis, DNA / statistics & numerical data
  • Single-Cell Analysis / methods*
  • Software*

Grants and funding

This project is supported by the Key Laboratory Project in Guangdong Province (2011A060906007) and the Key Laboratory Project in Shenzhen (Shenzhen Municipal Commission of Development and Reform [2011] No. 861). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.