Pathway analysis for genome-wide genetic variation data: Analytic principles, latest developments, and new opportunities

J Genet Genomics. 2021 Mar 20;48(3):173-183. doi: 10.1016/j.jgg.2021.01.007. Epub 2021 Feb 26.

Abstract

Pathway analysis, also known as gene-set enrichment analysis, is a multilocus analytic strategy that integrates a priori biological knowledge into the statistical analysis of high-throughput genetics data. Originally developed for studies of gene expression data, it has become a powerful analytic procedure for in-depth mining of genome-wide genetic variation data. In recent years it has led to notable discoveries, uncovering genes and biological mechanisms underlying common and complex disorders. However, as massive amounts of diverse functional genomics data accrue, there is a pressing need for newer generations of pathway analysis methods that can utilize multiple layers of high-throughput genomics data. In this review, we provide an intellectual foundation for this analytic strategy, as well as an update on the state of the art in recent method development. The goal of this review is threefold: (1) to introduce the motivation and basic steps of pathway analysis for genome-wide genetic variation data; (2) to review the merits and shortcomings of classic and newly emerging integrative pathway analysis tools; and (3) to discuss remaining challenges and future directions for further method development.

Keywords: Gene-set enrichment analysis; Genome-wide association study; Multilocus association analysis; Pathway analysis; Set-based association analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Algorithms
  • Genetic Predisposition to Disease*
  • Genome-Wide Association Study*
  • Humans