Detection of common single nucleotide polymorphisms synthesizing quantitative trait association of rarer causal variants

Genome Res. 2011 Jul;21(7):1122-30. doi: 10.1101/gr.115832.110. Epub 2011 Mar 25.

Abstract

Genome-wide association (GWA) studies have identified hundreds of common (minor allele frequency ≥5%) single nucleotide polymorphisms (SNPs) associated with phenotypic traits or diseases, yet the causal variants accounting for the association signals have rarely been determined. This raises the question of whether a GWA signal represents an "indirect association", in which the SNP is a proxy for a strongly correlated causal variant of similar frequency, or a "synthetic association" of one or more rarer causal variants in linkage disequilibrium with the SNP (D' ≈ 1, but r² not large); answering the question generally requires extensive resequencing and association analysis. Instead, we propose to test statistically whether a quantitative trait (QT) association of an SNP represents a synthetic association by inspecting the QT distribution at each genotype, without requiring the causal variant(s) to be known. We devised two test statistics and assessed their power by mathematical analysis and simulation. Testing the heterogeneity of variance was powerful when low-frequency causal alleles are linked mostly to one SNP allele, while testing the skewness performed better when the causal alleles are linked evenly to either of the SNP alleles. By testing a statistic combining these two in 5000 individuals, we could detect synthetic association of a GWA signal when the causal alleles sum to 3% in frequency. Such a signal explains only part of the heritability contributed by the whole locus. The proposed test is useful for designing fine mapping after the association of common SNPs has been studied exhaustively; it allows us to prioritize which GWA signals and which individuals to resequence, and thus to identify the causal variants efficiently.
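
The idea of inspecting the QT distribution at each genotype can be illustrated with a small simulation. The sketch below is not the test statistics devised in the paper; it only shows why per-genotype variance and skewness carry information about a synthetic association. Every parameter value (SNP allele frequency 0.3, effect size 2.0, unit-variance Gaussian noise) and both function names (`simulate`, `genotype_stats`) are assumptions made for illustration. Only the sample size of 5000 and the summed causal allele frequency of 3% come from the abstract. Rare causal alleles are placed only on haplotypes carrying SNP allele A, which corresponds to the case where causal alleles are linked mostly to one SNP allele.

```python
import random
import statistics


def simulate(n=5000, snp_freq=0.3, causal_freq=0.03, effect=2.0, seed=1):
    """Simulate (genotype, trait) pairs under a toy synthetic-association model.

    Assumed model: a rare causal allele occurs only on haplotypes that carry
    SNP allele A (D' = 1, low r^2), and each causal copy shifts the trait by
    `effect` on top of standard Gaussian noise.
    """
    rng = random.Random(seed)
    # Probability that an A-carrying haplotype also carries a causal allele,
    # chosen so the overall causal allele frequency equals `causal_freq`.
    p_causal_given_a = causal_freq / snp_freq
    data = []
    for _ in range(n):
        genotype = 0       # copies of SNP allele A (0, 1 or 2)
        causal_count = 0   # copies of the rare causal allele
        for _hap in range(2):
            if rng.random() < snp_freq:
                genotype += 1
                if rng.random() < p_causal_given_a:
                    causal_count += 1
        trait = effect * causal_count + rng.gauss(0.0, 1.0)
        data.append((genotype, trait))
    return data


def genotype_stats(data):
    """Return count, mean, variance and skewness of the trait per genotype."""
    groups = {}
    for genotype, trait in data:
        groups.setdefault(genotype, []).append(trait)
    stats = {}
    for genotype, values in sorted(groups.items()):
        mean = statistics.fmean(values)
        m2 = sum((v - mean) ** 2 for v in values) / len(values)  # 2nd central moment
        m3 = sum((v - mean) ** 3 for v in values) / len(values)  # 3rd central moment
        stats[genotype] = {
            "n": len(values),
            "mean": mean,
            "var": m2,
            "skew": m3 / m2 ** 1.5,  # moment-based sample skewness
        }
    return stats


if __name__ == "__main__":
    for genotype, s in genotype_stats(simulate()).items():
        print(f"A copies={genotype}  n={s['n']:5d}  mean={s['mean']:.3f}  "
              f"var={s['var']:.3f}  skew={s['skew']:.3f}")
```

In this toy model the genotype groups that carry allele A are mixtures of carriers and non-carriers of the rare allele, so their trait variance is inflated and their distribution is right-skewed relative to the group with no A allele. Under an indirect association with a common causal variant in strong correlation with the SNP, each genotype group would be close to homogeneous and no such pattern would be expected.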

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Apolipoproteins E / genetics
  • Computer Simulation
  • Databases, Genetic
  • Gene Frequency
  • Genetic Heterogeneity
  • Genome-Wide Association Study / methods
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Models, Statistical*
  • Phenotype
  • Polymorphism, Single Nucleotide*
  • Quantitative Trait Loci*

Substances

  • Apolipoproteins E