CNVineta: a data mining tool for large case-control copy number variation datasets

Bioinformatics. 2010 Sep 1;26(17):2208-9. doi: 10.1093/bioinformatics/btq356. Epub 2010 Jul 6.

Abstract

Motivation: Copy number variation (CNV), a major contributor to human genetic variation, comprises genomic deletions and insertions of ≥1 kb. Yet the identification of CNVs from microarray data is still hampered by high false-negative and false-positive prediction rates owing to the noisy nature of the raw data. Here, we present CNVineta, an R package for rapid data mining and visualization of CNVs in large case-control datasets genotyped with single nucleotide polymorphism oligonucleotide arrays. CNVineta is compatible with various established CNV prediction algorithms, can be used for genome-wide association analysis of rare and common CNVs, and enables rapid, serial display of log2 ratios of the raw data as well as B-allele frequencies for visual quality inspection. In summary, CNVineta aids in the interpretation of large-scale CNV datasets and the prioritization of target regions for follow-up experiments.
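The quantities named above can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the CNVineta API. The function names, the simplified B-allele frequency (the B allele's share of total signal) and the choice of Fisher's exact test for comparing CNV carrier counts are all assumptions made for illustration, not the package's documented method.

```python
import math


def log2_ratio(observed, reference):
    """log2 of observed over reference intensity (hypothetical helper).

    A value near 0 suggests the expected two copies; an idealized
    heterozygous deletion (one copy instead of two) gives -1.
    """
    return math.log2(observed / reference)


def naive_baf(a_intensity, b_intensity):
    """Simplified B-allele frequency: fraction of signal from the B allele.

    Assumption: real array pipelines normalize against genotype clusters;
    this plain ratio is only a rough stand-in.
    """
    return b_intensity / (a_intensity + b_intensity)


def fisher_two_sided(case_cnv, case_no, ctrl_cnv, ctrl_no):
    """Two-sided Fisher's exact p-value for a 2x2 carrier table."""
    row1 = case_cnv + case_no      # all cases
    row2 = ctrl_cnv + ctrl_no      # all controls
    col1 = case_cnv + ctrl_cnv     # all CNV carriers
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k carriers among the cases.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    p_obs = prob(case_cnv)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    print(log2_ratio(0.5, 1.0))          # idealized one-copy loss
    print(naive_baf(1.0, 1.0))           # balanced heterozygous signal
    print(fisher_two_sided(3, 1, 1, 3))  # made-up carrier counts
```

The carrier counts in the usage lines are invented purely to exercise the function. An exact test is used here because rare CNVs yield small cell counts, for which asymptotic tests are unreliable.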

Availability and implementation: CNVineta is available as an R package and can be downloaded from http://www.ikmb.uni-kiel.de/CNVineta/; the package contains a tutorial outlining a typical workflow. The CNVineta-compatible HapMap dataset can also be downloaded from the same address.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Mining*
  • Gene Dosage*
  • Gene Frequency
  • Genetic Variation*
  • Genome, Human
  • Genome-Wide Association Study
  • Genotype
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide
  • Software*