CNVineta: a data mining tool for large case-control copy number variation datasets

Michael Wittig; Ingo Helbig; Stefan Schreiber; Andre Franke

doi:10.1093/bioinformatics/btq356

Bioinformatics. 2010 Sep 1;26(17):2208-9. doi: 10.1093/bioinformatics/btq356. Epub 2010 Jul 6.

Authors

Michael Wittig¹, Ingo Helbig, Stefan Schreiber, Andre Franke

Affiliation

¹ Institute of Clinical Molecular Biology, Christian-Albrechts-University Kiel, Schittenhelmstrasse 12, 24105 Kiel, Germany. m.wittig@mucosa.de

Abstract

Motivation: Copy number variation (CNV), a major contributor to human genetic variation, comprises genomic deletions and insertions of ≥1 kb. Yet the identification of CNVs from microarray data is still hampered by high false-negative and false-positive prediction rates due to the noisy nature of the raw data. Here, we present CNVineta, an R package for rapid data mining and visualization of CNVs in large case-control datasets genotyped with single nucleotide polymorphism oligonucleotide arrays. CNVineta is compatible with various established CNV prediction algorithms, can be used for genome-wide association analysis of rare and common CNVs, and enables rapid and serial display of log2 of raw data ratios as well as B-allele frequencies for visual quality inspection. In summary, CNVineta aids in the interpretation of large-scale CNV datasets and the prioritization of target regions for follow-up experiments.
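As a rough illustration of two quantities mentioned above, the following Python sketch computes a log2 intensity ratio for a single probe and a case-control association p-value for CNV carrier counts. It is not CNVineta's API (CNVineta is an R package), and the abstract does not state which statistical test the package applies; the use of Fisher's exact test and the function names `log2_ratio` and `fisher_exact_two_sided` are assumptions made for this example only.

```python
from math import comb, log2


def log2_ratio(observed, expected):
    """Log2 of observed over expected probe intensity (hypothetical helper).

    Values near 0 suggest the expected copy number, negative values a
    possible deletion, positive values a possible duplication.
    """
    return log2(observed / expected)


def fisher_exact_two_sided(case_cnv, case_no_cnv, ctrl_cnv, ctrl_no_cnv):
    """Two-sided Fisher's exact test p-value for a 2x2 carrier table.

    Assumption: Fisher's exact test stands in for whatever association
    test a real analysis would use.
    """
    n_cases = case_cnv + case_no_cnv
    n_ctrls = ctrl_cnv + ctrl_no_cnv
    n_carriers = case_cnv + ctrl_cnv
    denom = comb(n_cases + n_ctrls, n_carriers)

    def prob(k):
        # Hypergeometric probability of seeing k carriers among the cases.
        return comb(n_cases, k) * comb(n_ctrls, n_carriers - k) / denom

    p_obs = prob(case_cnv)
    k_min = max(0, n_carriers - n_ctrls)
    k_max = min(n_cases, n_carriers)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p)


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    print(log2_ratio(0.5, 1.0))
    print(fisher_exact_two_sided(12, 988, 3, 997))
```

The exact test is chosen here because it remains valid for the small carrier counts typical of rare CNVs, where a chi-squared approximation would be unreliable.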

Availability and implementation: CNVineta is available as an R package and can be downloaded from http://www.ikmb.uni-kiel.de/CNVineta/; the package contains a tutorial outlining a typical workflow. The CNVineta compatible HapMap dataset can also be downloaded from the link above.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Data Mining*
Gene Dosage*
Gene Frequency
Genetic Variation*
Genome, Human
Genome-Wide Association Study
Genotype
Humans
Oligonucleotide Array Sequence Analysis
Polymorphism, Single Nucleotide
Software*