The effect of rare variants on inflation of the test statistics in case-control analyses

BMC Bioinformatics. 2015 Feb 20:16:53. doi: 10.1186/s12859-015-0496-1.

Abstract

Background: The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself, particularly in the analysis of rare variants. Using simulated data, we compared the properties of the three most commonly used association tests (the likelihood ratio test, the Wald test and the score test) when testing rare variants for association.
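The ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes each association test yields a 1-degree-of-freedom chi-square statistic under the null, and the names `EXPECTED_MEDIAN_CHI2_1DF` and `inflation_factor` are hypothetical.

```python
from statistics import NormalDist, median

# Assumption: each test statistic is chi-square with 1 df under the null.
# Its median is the square of the 75th percentile of a standard normal
# (about 0.4549), because P(Z^2 <= m) = 0.5 implies P(Z <= sqrt(m)) = 0.75.
EXPECTED_MEDIAN_CHI2_1DF = NormalDist().inv_cdf(0.75) ** 2


def inflation_factor(chi2_stats):
    """Ratio of the observed median test statistic to its null expectation."""
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / EXPECTED_MEDIAN_CHI2_1DF
```

A value near 1 indicates no inflation at the median; values above 1 indicate inflation and values below 1 indicate under-inflation.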

Results: We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with fewer than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency.
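To make the comparison concrete, the sketch below computes all three statistics for a single 2x2 table. It rests on a simplifying assumption that is not taken from the paper: genotypes are collapsed to carrier versus non-carrier, so that a logistic model with one binary predictor has closed forms (the score test is Pearson's chi-square, the likelihood ratio test is the G statistic, and the Wald test uses the log odds ratio and its asymptotic variance). The function name `three_tests` is hypothetical.

```python
import math


def three_tests(a, b, c, d):
    """Return 1-df score, likelihood ratio and Wald statistics for a 2x2 table.

    a = case carriers,    b = case non-carriers,
    c = control carriers, d = control non-carriers.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if min(rows) == 0 or min(cols) == 0:
        raise ValueError("every row and column total must be positive")

    # Score test: Pearson's chi-square without continuity correction.
    score = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])

    # Likelihood ratio test: G = 2 * sum(O * ln(O / E)); empty cells add 0.
    observed = ((a, b), (c, d))
    lrt = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = rows[i] * cols[j] / n
                lrt += 2.0 * o * math.log(o / expected)

    # Wald test: (log odds ratio)^2 / (1/a + 1/b + 1/c + 1/d).
    # The closed form is undefined when any cell is zero, so return NaN.
    if min(a, b, c, d) == 0:
        wald = float("nan")
    else:
        log_or = math.log((a * d) / (b * c))
        wald = log_or ** 2 / (1 / a + 1 / b + 1 / c + 1 / d)

    return {"score": score, "lrt": lrt, "wald": wald}
```

For example, `three_tests(8, 992, 2, 998)` gives roughly 3.62 for the score test, 3.87 for the likelihood ratio test and 3.09 for the Wald test, whereas a table with common carriers such as `three_tests(300, 700, 250, 750)` gives three nearly identical values. A single table cannot show inflation at the median; it only illustrates that the three statistics diverge when carrier counts are small.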

Conclusions: In a genetic association study, if a substantial proportion of the genetic variants tested have low minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic, which may mask the presence of population structure.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Case-Control Studies
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Gene Frequency
  • Genetic Association Studies
  • Genetic Markers
  • Genetic Predisposition to Disease
  • Genetic Variation / genetics*
  • Heterozygote
  • Humans
  • Models, Genetic*
  • Models, Statistical*

Substances

  • Genetic Markers