Variation of gene-based SNPs and linkage disequilibrium patterns in the human genome

Hum Mol Genet. 2004 Aug 1;13(15):1623-32. doi: 10.1093/hmg/ddh177. Epub 2004 Jun 9.

Abstract

A principal goal in human genetics is to provide the tools necessary to enable genome-wide association studies. Extensive information on the distribution of gene-based single-nucleotide polymorphisms (SNPs) and linkage disequilibrium (LD) patterns across the genome is required in order to choose markers for efficient implementation of this approach. To obtain such information, we have genotyped a large Japanese cohort for SNPs identified by systematic resequencing of more than 14 000 autosomal genes. Analysis of these data led to the conclusion that the Japanese population contains approximately 130 000 common autosomal gene haplotypes (frequency >0.05), of which more than 35% are identified in the present study. We also examined allele frequencies and LD patterns according to the position of variants within genes, and their distribution across the genome. We found lower allele variability at exonic SNP sites (both non-synonymous and synonymous) compared with non-exonic SNP sites, and greater average LD between SNPs within exons of the same gene compared with other SNP combinations, both of which could be signals of selection. LD was correlated with the recombination rate per physical distance as estimated from the meiotic map, but the strength of the relationship varied considerably in different regions of the genome. Unique LD patterns, characterized by frequent instances of high LD between non-adjacent SNPs punctuated by blocks of low LD, were found in a 7 Mb region on chromosome 6p that includes the MHC (major histocompatibility complex) locus and many non-MHC genes. These results demonstrate the complexity that must be taken into account when considering SNP variability and LD patterns, while also providing tools necessary for implementation of efficient genome-wide association studies.
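As an illustration of the quantities discussed above, the following minimal sketch computes the standard pairwise LD statistics (D, D' and r²) for two biallelic SNPs and lists haplotypes above the 0.05 frequency threshold. The abstract does not state which LD measure or software the authors used, so the function names, the 0/1 allele coding and the assumption of already-phased haplotypes are all illustrative choices, not the study's method.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, D', r^2) for two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) tuples,
    with alleles coded 0/1. Phased data is an assumption of this sketch.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n            # freq of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")

    d = p_ab - p_a * p_b                               # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max                           # normalized to [0, 1]
    r2 = d * d / denom                                 # squared allelic correlation
    return d, d_prime, r2


def common_haplotypes(haplotypes, threshold=0.05):
    """Return {haplotype: frequency} for haplotypes with frequency > threshold."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return {h: c / n for h, c in counts.items() if c / n > threshold}


if __name__ == "__main__":
    # Hypothetical sample: two SNPs in complete LD.
    sample = [(1, 1)] * 5 + [(0, 0)] * 5
    print(ld_stats(sample))
    print(common_haplotypes(sample))
```

With real genotype data the haplotype phase is usually unknown and must first be inferred, which this sketch does not attempt.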

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosomes, Human, Pair 6
  • Gene Frequency
  • Genome, Human*
  • Haplotypes
  • Humans
  • Linkage Disequilibrium*
  • Polymorphism, Single Nucleotide*