Efficacy assessment of SNP sets for genome-wide disease association studies

Nucleic Acids Res. 2007;35(17):e113. doi: 10.1093/nar/gkm621. Epub 2007 Aug 28.

Abstract

The power of a genome-wide disease association study depends critically upon the properties of the marker set used, particularly the number and physical spacing of markers, and the level of inter-marker association due to linkage disequilibrium. Extending our previously devised theoretical framework for the entropy-based selection of genetic markers, we have developed a local measure of the efficacy of a marker set, relative to including a maximally polymorphic single nucleotide polymorphism (SNP) at the map position of interest. Using this quantitative criterion, we evaluated five currently available SNP sets, namely Affymetrix 100K and 500K, and Illumina 100K, 300K and 550K, in the CEU, YRI and JPT + CHB HapMap populations. At 50% relative efficacy, the commercial marker sets cover between 19% and 68% of the human genome, depending upon the population under study. An optimal technology-independent 500K marker set constructed from HapMap for Caucasians, in contrast, would achieve 73% coverage at the same relative efficacy.
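
The coverage figures quoted above can be read as the fraction of map positions at which a marker set reaches a given relative efficacy. The abstract does not define the efficacy measure itself, so the sketch below does not attempt to reproduce it. It only illustrates two pieces that follow from the text: why a SNP with allele frequency 0.5 is "maximally polymorphic" in the entropy sense, and how a coverage value at a 50% threshold could be computed from per-position relative efficacies. The function names and the input values are hypothetical and chosen for illustration only.

```python
import math


def snp_entropy(maf: float) -> float:
    """Shannon entropy (in bits) of a biallelic SNP with allele frequency maf.

    Entropy peaks at 1 bit when maf == 0.5, which is the sense in which
    such a SNP is maximally polymorphic.
    """
    if maf <= 0.0 or maf >= 1.0:
        return 0.0  # a monomorphic site carries no information
    return -(maf * math.log2(maf) + (1.0 - maf) * math.log2(1.0 - maf))


def coverage(relative_efficacy, threshold: float = 0.5) -> float:
    """Fraction of positions whose relative efficacy is at least threshold.

    relative_efficacy is assumed to hold one value in [0, 1] per map
    position; how those values are obtained is outside this sketch.
    """
    values = list(relative_efficacy)
    if not values:
        raise ValueError("need at least one map position")
    return sum(v >= threshold for v in values) / len(values)


# Made-up relative efficacies for four map positions (illustration only).
example_efficacy = [0.9, 0.4, 0.55, 0.2]
example_coverage = coverage(example_efficacy, threshold=0.5)
```

With these made-up inputs, two of the four positions reach the threshold, so `example_coverage` is 0.5. A real analysis would replace `example_efficacy` with values derived from the paper's efficacy measure across the genome.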

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping / methods*
  • Entropy
  • Genetic Diseases, Inborn / genetics*
  • Genetic Markers
  • Genome, Human*
  • Genomics / methods*
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide*

Substances

  • Genetic Markers