Empirical power of very rare variants for common traits and disease: results from Sanger sequencing 1998 individuals

Eur J Hum Genet. 2013 Sep;21(9):1027-30. doi: 10.1038/ejhg.2012.284. Epub 2013 Jan 16.

Abstract

The optimal study design for identifying rare variants associated with common disease is not yet clear, and researchers must decide whether to prioritize lower sequencing coverage on larger sample sizes or higher coverage on smaller sample sizes. High-coverage sequencing affords several advantages, such as genotype accuracy and improved identification of very rare variants, but at increased cost. However, the magnitude of the contribution of very rare variants to the statistical power of gene-based association tests is unknown. Using Sanger sequence data on seven genes from 1998 subjects with simulated phenotypes, we provide evidence that excluding very rare variants, in general, reduces the statistical power of rare variant association tests only modestly. If the probability of being causal and the effect size of the causal variants are inversely related to the minor allele frequency, then very rare variants do contribute some power, although the absolute power remains low. As very rare variants constitute the majority of variants identified in sequencing studies, these findings suggest that careful attention needs to be paid to the plausible relationships that exist between very rare variants and common disease.
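
The comparison described in the abstract (power of a gene-based test with and without very rare variants, under simulated phenotypes) can be illustrated with a toy simulation. The sketch below is not the authors' method: it assumes a simple burden test (sum of minor-allele counts per individual, tested by correlation with a quantitative phenotype), independent variants in Hardy-Weinberg equilibrium, every variant causal, and an effect size proportional to |log10(MAF)|. All function names, parameters, and thresholds are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_genotypes(n, mafs, rng):
    """Return an n x m matrix of minor-allele counts (0, 1 or 2).

    Assumption: variants are independent and in Hardy-Weinberg equilibrium.
    """
    return [[(rng.random() < p) + (rng.random() < p) for p in mafs]
            for _ in range(n)]


def burden_scores(genotypes, mafs, min_maf=0.0):
    """Sum minor-allele counts per individual over variants with MAF >= min_maf."""
    keep = [j for j, p in enumerate(mafs) if p >= min_maf]
    return [sum(row[j] for j in keep) for row in genotypes]


def correlation_stat(x, y):
    """t statistic for the Pearson correlation of x and y.

    For large n it is approximately standard normal under the null.
    Returns 0.0 when either input has no variance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(max(1e-12, 1 - r * r))


def estimate_power(n, mafs, effect_scale, very_rare_cutoff, reps,
                   seed=0, crit=1.96):
    """Estimate burden-test power with and without very rare variants.

    Assumed effect model (illustrative only): every variant is causal and
    its effect grows as MAF shrinks, beta = effect_scale * |log10(MAF)|.
    crit=1.96 corresponds to a two-sided 5% test under the normal approximation.
    """
    rng = random.Random(seed)
    betas = [effect_scale * abs(math.log10(p)) for p in mafs]
    hits_all = hits_filtered = 0
    for _ in range(reps):
        g = simulate_genotypes(n, mafs, rng)
        # Quantitative phenotype: genetic effect plus standard normal noise.
        y = [sum(b * c for b, c in zip(betas, row)) + rng.gauss(0, 1)
             for row in g]
        if abs(correlation_stat(burden_scores(g, mafs), y)) > crit:
            hits_all += 1
        filtered = burden_scores(g, mafs, very_rare_cutoff)
        if abs(correlation_stat(filtered, y)) > crit:
            hits_filtered += 1
    return {"all_variants": hits_all / reps,
            "excluding_very_rare": hits_filtered / reps}
```

Calling, for example, `estimate_power(n=1998, mafs=[0.0005] * 10 + [0.01] * 3, effect_scale=0.2, very_rare_cutoff=0.001, reps=200)` returns the two estimated power values side by side. The numbers depend entirely on the assumed effect model and should not be read as a reproduction of the paper's results.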

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical
  • Empirical Research
  • Genes
  • Genetic Association Studies / methods*
  • Genetic Predisposition to Disease
  • Genetic Variation
  • Humans
  • Models, Genetic
  • Sequence Analysis, DNA*