Quality control metrics improve repeatability and reproducibility of single-nucleotide variants derived from whole-genome sequencing

Pharmacogenomics J. 2015 Aug;15(4):298-309. doi: 10.1038/tpj.2014.70. Epub 2014 Nov 11.

Abstract

Although many quality control (QC) methods have been developed to improve the quality of single-nucleotide variants (SNVs) during SNV calling, QC methods for use after SNV calling have not been reported. We developed five QC metrics to improve the quality of SNVs, using whole-genome-sequencing data from a monozygotic twin pair in the Korean Personal Genome Project. The QC metrics improved both repeatability between the monozygotic twins and reproducibility between SNV-calling pipelines. We demonstrated that the QC metrics improve the reproducibility of SNVs derived not only from whole-genome-sequencing data but also from whole-exome-sequencing data. The QC metrics are calculated from the reference genome used in the alignment, without access to the raw or intermediate data and without knowledge of the SNV-calling details. The QC metrics can therefore be easily adopted in downstream association analysis.
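
To make the two evaluation criteria concrete, the following is a minimal sketch of how agreement between two SNV call sets might be scored, whether the two sets come from the two twins (repeatability) or from two pipelines run on the same sample (reproducibility). The abstract does not define the five QC metrics or the exact agreement measure, so everything here is an assumption for illustration: the `concordance` function, the site key `(chrom, pos, ref, alt)`, the genotype strings, and the choice of "matching genotypes over the union of called sites" as the score are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical site key: (chromosome, position, reference allele, alternate allele)
Site = Tuple[str, int, str, str]


def concordance(calls_a: Dict[Site, str], calls_b: Dict[Site, str]) -> float:
    """Fraction of sites called in either set that have the same genotype in both.

    This is an assumed, illustrative agreement measure, not the definition
    used in the paper.
    """
    union = set(calls_a) | set(calls_b)
    if not union:
        # Convention chosen for this sketch: two empty call sets agree fully.
        return 1.0
    matching = sum(
        1
        for site in union
        if site in calls_a and site in calls_b and calls_a[site] == calls_b[site]
    )
    return matching / len(union)


# Toy call sets (invented values, for illustration only).
twin_1 = {
    ("chr1", 1000, "A", "G"): "0/1",
    ("chr1", 2000, "C", "T"): "1/1",
    ("chr2", 500, "G", "A"): "0/1",
}
twin_2 = {
    ("chr1", 1000, "A", "G"): "0/1",
    ("chr1", 2000, "C", "T"): "0/1",  # same site, different genotype
    ("chr3", 42, "T", "C"): "1/1",    # called in only one twin
}

print(concordance(twin_1, twin_2))  # 1 matching site out of 4 in the union
```

Under this assumed measure, a post-calling QC filter would be judged useful if removing the SNVs it flags raises the score between the twins or between pipelines.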

Publication types

  • Research Support, Non-U.S. Gov't
  • Twin Study

MeSH terms

  • Algorithms
  • Base Sequence
  • Chromosome Mapping
  • Genome, Human / genetics*
  • Genome-Wide Association Study / standards*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Polymorphism, Single Nucleotide / genetics*
  • Quality Control
  • Reproducibility of Results
  • Republic of Korea
  • Twins, Monozygotic