Inferring a directed acyclic graph of phenotypes from GWAS summary statistics

bioRxiv [Preprint]. 2023 Nov 25:2023.02.10.528092. doi: 10.1101/2023.02.10.528092.

Abstract

Estimating phenotype networks is a growing field in computational biology that deepens the understanding of disease etiology and is useful in many applications. In this study, we present a method that constructs a phenotype network by assuming a Gaussian linear structural equation model embedding a directed acyclic graph (DAG). We use genetic variants as instrumental variables and show that the method requires only summary statistics from a genome-wide association study (GWAS) and a reference panel of genotype data. Beyond estimation, a distinctive feature of the method is its summary statistics-based likelihood ratio test for directed edges. We applied the method to estimate a causal network of 29 cardiovascular-related proteins and linked the estimated network to Alzheimer's disease (AD). A simulation study was conducted to demonstrate the effectiveness of the method. An R package, sumdag, implementing the proposed method, together with all relevant code and a Shiny application, is available at https://github.com/chunlinli/sumdag.
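
To make the model class concrete, the following is a minimal toy sketch in Python; it is not the sumdag implementation. It assumes three phenotypes, three independent variants, and that each variant is a valid instrument for exactly one phenotype. The matrices `U` and `W`, the sample size, and the helper `is_dag` are illustrative assumptions, not taken from the paper. The sketch simulates a Gaussian linear structural equation model embedding a DAG, computes GWAS-style marginal slopes, and recovers the direct effects from those slopes alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50_000, 3, 3  # samples, phenotypes, variants (all hypothetical)

# U[k, j]: direct effect of phenotype k on phenotype j.
# Strictly upper triangular, so the graph 0 -> 1 -> 2 is acyclic.
U = np.array([[0.0, 0.8, 0.0],
              [0.0, 0.0, -0.5],
              [0.0, 0.0, 0.0]])
# W[l, j]: effect of variant l on phenotype j.
# Diagonal by assumption: variant l is an instrument for phenotype l only.
W = np.eye(q)

def is_dag(A):
    """True if the nonzero pattern of A has no directed cycle (nilpotent support)."""
    S = (A != 0).astype(float)
    return not np.linalg.matrix_power(S, A.shape[0]).any()

# Simulate genotypes (0/1/2 allele counts), centered, and Gaussian noise.
X = rng.binomial(2, 0.3, size=(n, q)).astype(float)
X -= X.mean(axis=0)
E = rng.normal(size=(n, p))

# Structural equations Y = Y U + X W + E, solved for Y.
Y = (X @ W + E) @ np.linalg.inv(np.eye(p) - U)

# GWAS-style marginal slopes: beta[l, j] regresses phenotype j on variant l.
beta = (X.T @ Y) / (X ** 2).sum(axis=0)[:, None]

# Ratio estimates give total effects: T[k, j] = beta[k, j] / beta[k, k].
T = beta / np.diag(beta)[:, None]

# Under the assumptions above T estimates (I - U)^{-1}, so direct effects follow.
U_hat = np.eye(p) - np.linalg.inv(T)
print(np.round(U_hat, 2))
```

In this toy setting the ratio for the pair (0, 2) estimates a total effect, which is nonzero even though there is no direct edge from phenotype 0 to phenotype 2; inverting the total-effect matrix is what separates direct from mediated effects. A realistic analysis would not know in advance which variant instruments which phenotype and would need a reference panel to account for correlation between variants.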

Keywords: Alzheimer’s disease (AD); directed acyclic graph (DAG); genome-wide association study (GWAS); likelihood ratio test; proteomics.

Publication types

  • Preprint