Evaluation of the SNP tagging approach in an independent population sample: array-based SNP discovery in Sami

Hum Genet. 2007 Sep;122(2):141-50. doi: 10.1007/s00439-007-0379-2. Epub 2007 Jun 7.

Abstract

Significant efforts have been made to determine the correlation structure of common SNPs in the human genome. One approach has been to identify sets of tagSNPs that capture most of the genetic variation. Here, we evaluate the transferability of tagSNPs between populations using a population sample of Sami, the indigenous people of Scandinavia. Array-based SNP discovery in a 4.4 Mb region of 28 phased copies of chromosome 21 uncovered 5,132 segregating sites, 3,188 of which had a minimum minor allele frequency (mMAF) of 0.1. Due to the population structure and the consequently high linkage disequilibrium (LD), the number of tagSNPs needed to capture all SNP variation in Sami is much lower than that for the HapMap populations. TagSNPs identified from the HapMap data perform only slightly better in the Sami than tagSNPs chosen at random from the same set of common SNPs. Surprisingly, tagSNPs defined from the HapMap data did not perform better than the same number of SNPs selected at random from all SNPs discovered in Sami. Nearly half (46%) of the Sami SNPs with a mMAF of 0.1 are not present in the HapMap dataset. Among sites overlapping between Sami and HapMap populations, 18% are not tagged by the European American (CEU) HapMap tagSNPs, while 43% of the SNPs that are unique to Sami are not tagged by the CEU tagSNPs. These results point to serious limitations in the transferability of common tagSNPs to capture random sequence variation, even between closely related populations, such as CEU and Sami.
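
The abstract does not state which tagging algorithm or correlation threshold was used, so the following is only a minimal sketch of the general idea, under stated assumptions: SNPs are biallelic and coded 0/1 on phased haplotypes, a SNP counts as "tagged" when its pairwise r² with some tagSNP is at least 0.8 (a commonly used cutoff, assumed here), and tags are chosen by a simple greedy procedure. The function names `r_squared`, `greedy_tag_snps`, and `fraction_tagged` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def r_squared(a, b):
    """Pairwise r^2 between two biallelic SNPs coded 0/1 on phased haplotypes."""
    n = len(a)
    pa = sum(a) / n                                  # allele-1 frequency at SNP a
    pb = sum(b) / n                                  # allele-1 frequency at SNP b
    pab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:                                   # a monomorphic site carries no LD signal
        return 0.0
    d = pab - pa * pb                                # disequilibrium coefficient D
    return d * d / denom


def maf(h):
    """Minor allele frequency of one SNP."""
    return min(sum(h), len(h) - sum(h)) / len(h)


def greedy_tag_snps(haplotypes, min_maf=0.1, r2_threshold=0.8):
    """Greedy tag selection (assumed procedure, not necessarily the paper's).

    haplotypes: dict mapping SNP id -> list of 0/1 alleles, one per chromosome.
    """
    common = {s: h for s, h in haplotypes.items() if maf(h) >= min_maf}
    captures = {s: {s} for s in common}              # every SNP captures itself
    for s, t in combinations(common, 2):
        if r_squared(common[s], common[t]) >= r2_threshold:
            captures[s].add(t)
            captures[t].add(s)
    untagged, tags = set(common), []
    while untagged:
        # Pick the SNP that captures the most still-untagged SNPs.
        best = max(sorted(common), key=lambda s: len(captures[s] & untagged))
        tags.append(best)
        untagged -= captures[best]
    return tags


def fraction_tagged(tags, targets, haplotypes, r2_threshold=0.8):
    """Share of target SNPs captured by at least one tag in this sample."""
    hit = sum(
        any(r_squared(haplotypes[t], haplotypes[g]) >= r2_threshold for g in tags)
        for t in targets
    )
    return hit / len(targets)
```

In this framing, a transferability check amounts to selecting tags in one population's haplotypes with `greedy_tag_snps` and then calling `fraction_tagged` on a second population's haplotypes with the same tag list, comparing the result against an equally sized random tag set.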

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Human, Pair 21 / genetics*
  • Databases, Genetic
  • Ethnicity / genetics*
  • Gene Frequency
  • Genetic Variation*
  • Haplotypes / genetics
  • Humans
  • Microsatellite Repeats / genetics
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide / genetics*
  • Sweden