An optimal kernel-based U-statistic method for quantitative gene-set association analysis

Genet Epidemiol. 2019 Mar;43(2):137-149. doi: 10.1002/gepi.22170. Epub 2018 Nov 19.

Abstract

Single-variant-based genome-wide association studies have successfully detected many genetic variants that are associated with a number of complex traits. However, their power is limited because marginal signals are weak and potential complex interactions among genetic variants are ignored. The set-based strategy was proposed as a remedy: multiple genetic variants in a given set (e.g., gene or pathway) are jointly evaluated, so that the systematic effect of the set is considered. Among the many set-based methods, the kernel-based testing (KBT) framework is one of the most popular and powerful. Given a set of candidate kernels, one proposed approach is to choose the kernel with the smallest p-value. Such an approach, however, can yield an inflated Type 1 error rate, especially when the number of variants in a set is large. Alternatively, one can obtain p-values by permutation, which, however, can be very time-consuming. In this study, we propose an efficient testing procedure that can not only control the Type 1 error rate but also achieve power close to that obtained under the optimal kernel in the candidate kernel set, for quantitative trait association studies. Our method, a maximum kernel-based U-statistic method, is built upon the KBT framework and is based on asymptotic results under a high-dimensional setting. Hence it can efficiently deal with the case where the number of variants in a set is much larger than the sample size. Both simulation and real data analysis demonstrate the advantages of the method compared with its counterparts.
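
To make the idea of a maximum over kernel-based U-statistics concrete, the following is a minimal sketch and not the authors' procedure. It assumes genotypes coded 0/1/2 in an n-by-p matrix G and a quantitative trait y. The two candidate kernels (a linear kernel and an IBS-like kernel) are illustrative choices, and all function names are hypothetical. The abstract states that the method relies on asymptotic results. This sketch instead calibrates the maximum statistic by permutation, which is the slower alternative the abstract mentions, simply because the asymptotic formulas are not given here.

```python
import numpy as np

def linear_kernel(G):
    # Linear similarity between subjects, averaged over variants.
    return G @ G.T / G.shape[1]

def ibs_like_kernel(G):
    # IBS-style similarity for 0/1/2 genotypes: 1 - |g_i - g_j| / 2,
    # averaged over variants (illustrative choice, not from the paper).
    diff = np.abs(G[:, None, :] - G[None, :, :])
    return 1.0 - diff.mean(axis=2) / 2.0

def u_statistic(K, y):
    # U = sum_{i != j} K_ij z_i z_j / (n (n - 1)), with z the standardized trait.
    n = len(y)
    z = (y - y.mean()) / y.std(ddof=1)
    K_offdiag = K - np.diag(np.diag(K))  # drop the i == j terms
    return z @ K_offdiag @ z / (n * (n - 1))

def max_kernel_permutation_test(G, y, kernels, n_perm=200, seed=0):
    # Permutation stand-in for the asymptotic calibration (an assumption).
    rng = np.random.default_rng(seed)
    kernel_matrices = [kernel(G) for kernel in kernels]

    def all_stats(trait):
        return np.array([u_statistic(K, trait) for K in kernel_matrices])

    observed = all_stats(y)
    permuted = np.array([all_stats(rng.permutation(y)) for _ in range(n_perm)])

    # Standardize each kernel's statistic by its permutation mean and SD,
    # so kernels on different scales are comparable before taking the max.
    mu = permuted.mean(axis=0)
    sd = permuted.std(axis=0, ddof=1)
    observed_max = np.max((observed - mu) / sd)
    permuted_max = np.max((permuted - mu) / sd, axis=1)

    # The max is recomputed in every permutation, so the p-value accounts
    # for having searched over kernels.
    p_value = (1 + np.sum(permuted_max >= observed_max)) / (n_perm + 1)
    return observed_max, p_value
```

Because the p-value comes from the permutation distribution of the maximum, it avoids the inflation that arises when the smallest per-kernel p-value is reported directly. The cost is one pass over all kernels per permutation, which is the computational burden the asymptotic approach described in the abstract is meant to remove.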

Keywords: gene-set association; high dimension; multiple kernels; nonlinear effect; quantitative trait.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Genetic Association Studies / methods*
  • Genome-Wide Association Study
  • Humans
  • Infant, Newborn
  • Models, Genetic
  • Statistics as Topic*