Replication strategies for rare variant complex trait association studies via next-generation sequencing

Am J Hum Genet. 2010 Dec 10;87(6):790-801. doi: 10.1016/j.ajhg.2010.10.025.

Abstract

There is solid evidence that complex traits can be caused by rare variants. Next-generation sequencing technologies are powerful tools for mapping rare variants. Confirmation of significant findings in stage 1 through replication in an independent stage 2 sample is necessary for association studies. For gene-based mapping of rare variants, two replication strategies are possible: (1) variant-based replication, wherein only variants from nucleotide sites uncovered in stage 1 are genotyped and followed up, and (2) sequence-based replication, wherein the gene region is sequenced in the replication sample and both known and novel variants are tested. The relative efficiency of the two strategies depends on the proportion of causative variants discovered in stage 1 and on sequencing and genotyping error rates. With rigorous population genetic and phenotypic models, it is demonstrated that sequence-based replication is consistently more powerful. However, the power gain is small (1) for large-scale studies with thousands of individuals, because a large fraction of causative variant sites can be observed, and (2) for small- to medium-scale studies with a few hundred samples, because a large proportion of the locus population attributable risk can be explained by the uncovered variants. Therefore, genotyping can be a temporary solution for replicating genetic studies if stage 1 and stage 2 samples are drawn from the same population. However, sequence-based replication is advantageous if the stage 1 sample is small or if the discovery of novel variants is also of interest. It is shown that currently attainable levels of sequencing error only minimally affect the comparison, and the advantage of sequence-based replication remains.
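
The role of the proportion of variants discovered in stage 1 can be illustrated with a minimal sketch. It is not the population genetic or phenotypic model used in the study. It assumes only that a variant with minor allele frequency p is observed at least once among the 2n chromosomes of n sequenced individuals with probability 1 - (1 - p)^(2n), that variant sites are independent, and that sequencing is error free. The function names and the allele frequencies are hypothetical, chosen for illustration. The frequency-weighted fraction is a crude stand-in for the share of locus risk carried by discovered variants, under the added assumption that all variants have equal effects.

```python
from typing import Sequence


def discovery_probability(maf: float, n_individuals: int) -> float:
    """Probability that a variant is seen at least once in stage 1.

    Assumes 2 * n_individuals independently sampled chromosomes and
    error-free sequencing (illustrative assumptions).
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    return 1.0 - (1.0 - maf) ** (2 * n_individuals)


def expected_fraction_discovered(mafs: Sequence[float], n_individuals: int) -> float:
    """Expected fraction of variant sites uncovered in stage 1."""
    return sum(discovery_probability(p, n_individuals) for p in mafs) / len(mafs)


def frequency_weighted_fraction(mafs: Sequence[float], n_individuals: int) -> float:
    """Expected fraction of total minor allele frequency at uncovered sites.

    A rough proxy for the share of locus risk explained by discovered
    variants, assuming equal effects across variants (an assumption).
    """
    total = sum(mafs)
    return sum(p * discovery_probability(p, n_individuals) for p in mafs) / total


if __name__ == "__main__":
    # Hypothetical allele frequencies for rare variants in one gene region.
    example_mafs = [0.0001, 0.0005, 0.001, 0.005]
    for n in (200, 500, 2000, 5000):
        sites = expected_fraction_discovered(example_mafs, n)
        weighted = frequency_weighted_fraction(example_mafs, n)
        print(f"n={n}: sites discovered={sites:.3f}, frequency-weighted={weighted:.3f}")
```

Under these assumptions, the frequency-weighted fraction exceeds the fraction of sites discovered when the stage 1 sample is small, because the variants missed are the rarest ones, and both fractions approach 1 as the sample grows to thousands of individuals. This mirrors the two cases in which the abstract reports a small power gain for sequence-based replication.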

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Genetic Predisposition to Disease*
  • Genome-Wide Association Study*
  • Humans
  • Models, Genetic
  • Probability
  • Sequence Analysis, DNA*