Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates

Nucleic Acids Res. 2006 Apr 20;34(7):e55. doi: 10.1093/nar/gkl136.

Abstract

Array-based DNA pooling techniques facilitate genome-wide genotyping of large samples. We describe a structured analysis method for pooled data that uses the internal replication information in large-scale genotyping sets. The method takes advantage of information from single nucleotide polymorphisms (SNPs) typed in parallel on a high-density array to construct a test statistic with desirable statistical properties. We use a general linear model to account for the structured multiple measurements available with array data. The method does not require additional arrays for the estimation of unequal hybridization rates and hence scales readily to arrays with several hundred thousand SNPs. Tests for differences between cases and controls can be conducted with very few arrays. We demonstrate the method on 384 endometriosis cases and controls, typed using Affymetrix GeneChip® HindIII 50 K arrays. For a subset of these data, accurate measures of hybridization rates were available. Assuming equal hybridization rates is shown to have a negligible effect on the results. With a total of only six arrays, the method extracted one-third of the information (in terms of equivalent sample size) available with individual genotyping (requiring 768 arrays). With 20 arrays (10 for cases, 10 for controls), over half of the information could be extracted from this sample.
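The efficiency figures quoted above can be put on a common scale with a little arithmetic. The sketch below is illustrative only and is not the authors' analysis method: it assumes one array per individual (so 768 arrays corresponds to 768 individuals) and treats "one-third" and "over half" as the exact fractions 1/3 and 1/2. The function name `equivalent_sample_size` and the constant names are hypothetical.

```python
from fractions import Fraction

# Figures quoted in the abstract.
ARRAYS_INDIVIDUAL = 768    # arrays needed for individual genotyping
ARRAYS_POOLED_SMALL = 6    # pooled design reported to recover about 1/3 of the information
ARRAYS_POOLED_LARGE = 20   # pooled design reported to recover over 1/2 of the information


def equivalent_sample_size(n_individuals, info_fraction):
    """Number of individually genotyped samples carrying the same information."""
    return n_individuals * info_fraction


# Assumption: one array per individual, so 768 arrays means 768 individuals.
n_individuals = ARRAYS_INDIVIDUAL

# Equivalent sample sizes implied by the reported information fractions.
n_eff_small = equivalent_sample_size(n_individuals, Fraction(1, 3))
n_eff_large_lower_bound = equivalent_sample_size(n_individuals, Fraction(1, 2))

# Fold reduction in the number of arrays relative to individual genotyping.
array_reduction_small = Fraction(ARRAYS_INDIVIDUAL, ARRAYS_POOLED_SMALL)
array_reduction_large = Fraction(ARRAYS_INDIVIDUAL, ARRAYS_POOLED_LARGE)

print(n_eff_small)                   # 256
print(n_eff_large_lower_bound)       # 384
print(array_reduction_small)         # 128
print(float(array_reduction_large))  # 38.4
```

Under these assumptions, six pooled arrays behave roughly like 256 individually genotyped samples while using 128 times fewer arrays, and 20 pooled arrays behave like more than 384 samples while using about 38 times fewer arrays.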

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Case-Control Studies
  • DNA / chemistry
  • Endometriosis / genetics
  • Female
  • Gene Frequency*
  • Genomics / methods
  • Genotype
  • Humans
  • Linear Models
  • Oligonucleotide Array Sequence Analysis / methods*
  • Polymorphism, Single Nucleotide*

Substances

  • DNA