Generalized functional linear models for gene-based case-control association studies

Genet Epidemiol. 2014 Nov;38(7):622-637. doi: 10.1002/gepi.21840. Epub 2014 Sep 9.

Abstract

Using functional data analysis techniques, we developed generalized functional linear models to test for association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative, with empirical type I error rates below the nominal levels, whereas global tests of the mixed effect models yield accurate type I error rates. Furthermore, Rao's efficient score tests of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), Rao's efficient score tests and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defect and Hirschsprung's disease datasets. In the real data analysis, Rao's efficient score tests and the global tests are more sensitive than SKAT and SKAT-O. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses.
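
To make the fixed effect test more concrete, the following is a minimal sketch of a Rao score test in a functional logistic model. It is not the authors' implementation. It assumes a simplified form of the model in which only the genetic effect function beta(t) is expanded in a basis, a Fourier basis chosen here for convenience, and variant positions scaled to [0, 1]. All function names and the simulated data are hypothetical and serve only as illustration.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate 1, sin(2*pi*t), cos(2*pi*t), sin(4*pi*t), ... at positions t in [0, 1]."""
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape (n_variants, n_basis)

def fit_null_logistic(y, Z, n_iter=25):
    """Fit logit P(y = 1) = Z @ alpha by Newton-Raphson; return fitted probabilities."""
    alpha = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-Z @ alpha))
        w = mu * (1.0 - mu)
        alpha = alpha + np.linalg.solve(Z.T @ (Z * w[:, None]), Z.T @ (y - mu))
    return 1.0 / (1.0 + np.exp(-Z @ alpha))

def rao_score_statistic(y, Z, G, pos, n_basis=5):
    """Score statistic for H0: beta(t) = 0, with beta(t) expanded in a Fourier basis."""
    W = G @ fourier_basis(pos, n_basis)   # genetic design, shape (n_subjects, n_basis)
    mu = fit_null_logistic(y, Z)          # covariate-only (null) fit
    d = mu * (1.0 - mu)
    U = W.T @ (y - mu)                    # score vector evaluated under H0
    ZdZ = Z.T @ (Z * d[:, None])
    ZdW = Z.T @ (W * d[:, None])
    # Information for the genetic coefficients, adjusted for the covariates.
    V = W.T @ (W * d[:, None]) - ZdW.T @ np.linalg.solve(ZdZ, ZdW)
    stat = float(U @ np.linalg.solve(V, U))
    return stat, W.shape[1]               # degrees of freedom = number of basis functions

# Simulated null data (assumption: not taken from the paper).
rng = np.random.default_rng(0)
n, m = 400, 20
pos = np.sort(rng.uniform(0.0, 1.0, m))                 # variant positions in [0, 1]
maf = rng.uniform(0.01, 0.3, m)                         # mix of rare and common variants
G = rng.binomial(2, maf, size=(n, m)).astype(float)     # genotypes coded 0/1/2
Z = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept plus one covariate
y = rng.binomial(1, 0.5, size=n).astype(float)          # case-control status
stat, df = rao_score_statistic(y, Z, G, pos)
```

Under the null hypothesis, the statistic would be compared with a chi-square distribution with `df` degrees of freedom. Because only the covariate-only model is fitted, the test avoids estimating the genetic effect function itself, and the number of tested parameters is set by the basis size rather than by the number of variants in the region.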

Keywords: case-control association studies; common variants; complex diseases; functional data analysis; generalized functional linear models; logistic regression; rare variants.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural

MeSH terms

  • Case-Control Studies*
  • Exome
  • Gene Frequency
  • Genes
  • Genetic Association Studies*
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study
  • Hirschsprung Disease / genetics
  • Humans
  • Linear Models
  • Models, Genetic*
  • Neural Tube Defects / genetics
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Software