Fast and accurate recurrent event analysis for genome-wide association studies

Genet Epidemiol. 2023 Jul;47(5):365-378. doi: 10.1002/gepi.22525. Epub 2023 Apr 15.

Abstract

Many diseases, such as cancers and infections, recur after recovery. However, research often analyses only the time to first recurrence, thereby ignoring any subsequent recurrences. Statistical models for the analysis of recurrent events are available, of which the extended Cox proportional hazards frailty model is the current state-of-the-art. However, the statistical complexity of this model prevents computationally efficient application to high-dimensional data sets, such as genome-wide association studies (GWAS). Here, we develop an application for fast and accurate recurrent event analysis in GWAS, called SPARE (SaddlePoint Approximation for Recurrent Event analysis). In SPARE, every DNA variant is tested for association with recurrence risk using a modified score statistic. A saddlepoint approximation is implemented to achieve statistical accuracy. SPARE controls the Type I error, and its statistical power is similar to that of existing recurrent event models, yet SPARE is substantially faster. An application of SPARE in a recurrent event GWAS on bladder cancer for 6.2 million DNA variants in 1,443 individuals required less than 15 min, whereas existing recurrent event methods would require several weeks.
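To illustrate the general idea of a saddlepoint approximation for a score statistic, the following is a minimal Python sketch. It is not the SPARE implementation. It assumes a score of the form S = sum_i g_i * R_i, where g_i is a genotype and the R_i are treated as independent draws from the empirical distribution of model residuals (for example, martingale residuals), and it uses the standard Lugannani-Rice tail formula. The function names, the bisection solver, and the switch to a normal approximation near the mean are choices made for this illustration only.

```python
import math
from collections import Counter


def tilted_moments(residuals, t):
    """Empirical CGF K0(t) of the residuals, with K0'(t) and K0''(t)."""
    shift = max(t * r for r in residuals)  # keeps every exponent <= 0
    w = [math.exp(t * r - shift) for r in residuals]
    s0 = sum(w)
    mean = sum(wi * r for wi, r in zip(w, residuals)) / s0
    var = sum(wi * (r - mean) ** 2 for wi, r in zip(w, residuals)) / s0
    return shift + math.log(s0 / len(residuals)), mean, var


def score_cgf(genotypes, residuals, t):
    """CGF of S = sum_i g_i * R_i and its first two derivatives at t.

    Assumption of this sketch: the R_i are i.i.d. draws from the
    empirical residual distribution, so K_S(t) = sum_i K0(g_i * t).
    """
    k0 = k1 = k2 = 0.0
    # Genotypes take few distinct values, so group identical ones.
    for g, count in Counter(genotypes).items():
        a, b, c = tilted_moments(residuals, g * t)
        k0 += count * a
        k1 += count * g * b
        k2 += count * g * g * c
    return k0, k1, k2


def normal_sf(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def saddlepoint_upper_tail(genotypes, residuals, s_obs):
    """Approximate P(S >= s_obs) with the Lugannani-Rice formula."""
    _, mean, var = score_cgf(genotypes, residuals, 0.0)
    if var <= 0.0:
        raise ValueError("score statistic has zero variance")
    z = (s_obs - mean) / math.sqrt(var)
    if abs(z) < 1e-3:
        # The formula is numerically unstable at the mean; use the normal tail.
        return normal_sf(z)

    # Bracket the root of K_S'(t) = s_obs; K_S' is increasing in t.
    sign = 1.0 if z > 0 else -1.0
    inner, outer = 0.0, sign
    for _ in range(60):
        if sign * (score_cgf(genotypes, residuals, outer)[1] - s_obs) > 0:
            break
        outer *= 2.0
    else:
        raise ValueError("s_obs lies outside the attainable range of S")

    # Plain bisection, run until the bracket reaches floating-point width.
    for _ in range(200):
        mid = 0.5 * (inner + outer)
        if sign * (score_cgf(genotypes, residuals, mid)[1] - s_obs) > 0:
            outer = mid
        else:
            inner = mid
    t_hat = 0.5 * (inner + outer)

    k0, _, k2 = score_cgf(genotypes, residuals, t_hat)
    w = math.copysign(math.sqrt(max(2.0 * (t_hat * s_obs - k0), 0.0)), t_hat)
    v = t_hat * math.sqrt(k2)
    phi = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    p = normal_sf(w) + phi * (1.0 / v - 1.0 / w)
    return min(max(p, 0.0), 1.0)
```

A call such as `saddlepoint_upper_tail(genotypes, residuals, s_obs)` returns an approximate one-sided tail probability. A two-sided p-value could be formed by adding the corresponding lower-tail probability, and the approximation matters most for rare variants and skewed residuals, where the normal approximation to the score statistic tends to be least reliable.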

Keywords: GWAS; frailty; martingale residuals; recurrent events; saddlepoint approximation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome-Wide Association Study*
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Neoplasm Recurrence, Local*
  • Proportional Hazards Models