Sparse dimensionality reduction approaches in Mendelian randomisation with highly correlated exposures

eLife. 2023 Apr 19;12:e80063. doi: 10.7554/eLife.80063.

Abstract

Multivariable Mendelian randomisation (MVMR) is an instrumental variable technique that generalises the MR framework to multiple exposures. Framed as a regression problem, it is subject to the pitfall of multicollinearity. The bias and efficiency of MVMR estimates thus depend heavily on the correlation of the exposures. Dimensionality reduction techniques such as principal component analysis (PCA) provide transformations of all the included variables that are effectively uncorrelated. We propose the use of sparse PCA (sPCA) algorithms that create principal components of subsets of the exposures, with the aim of providing more interpretable and reliable MR estimates. The approach consists of three steps. We first apply a sparse dimension reduction method and transform the variant-exposure summary statistics to principal components. We then choose a subset of the principal components based on data-driven cutoffs, and estimate their strength as instruments with an adjusted F-statistic. Finally, we perform MR with these transformed exposures. This pipeline is demonstrated in a simulation study of highly correlated exposures and in an applied example using summary data from a genome-wide association study of 97 highly correlated lipid metabolites. As a positive control, we tested the causal associations of the transformed exposures with coronary heart disease (CHD). Compared to the conventional inverse-variance weighted MVMR method and a weak-instrument-robust MVMR method (MR GRAPPLE), sparse component analysis achieved a superior balance of sparsity and biologically insightful grouping of the lipid traits.
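
The first and third steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the sparse component is obtained with a soft-thresholded power iteration, which is one simple sparse PCA heuristic and not necessarily any of the algorithms evaluated in the paper, and the component selection and adjusted F-statistic of the second step are omitted. The function names, the threshold `lam`, and all numbers are hypothetical toy values chosen so that two exposures are highly correlated and a third is unrelated.

```python
import math


def soft_threshold(x, lam):
    """Shrink x towards zero by lam; values within [-lam, lam] become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def sparse_first_loading(B, lam=0.1, n_iter=200):
    """Sparse leading loading vector of a variants-by-exposures matrix B.

    Soft-thresholded power iteration on the Gram matrix B^T B.
    This is an illustrative heuristic (an assumption of this sketch).
    """
    k = len(B[0])
    gram = [[sum(row[i] * row[j] for row in B) for j in range(k)]
            for i in range(k)]
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [soft_threshold(x / norm, lam) for x in w]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("lam is too large: every loading was set to zero")
        v = [x / norm for x in w]
    return v


def ivw_estimate(bx, by, se_by):
    """Inverse-variance weighted slope of by on bx, through the origin."""
    num = sum(x * y / s ** 2 for x, y, s in zip(bx, by, se_by))
    den = sum(x * x / s ** 2 for x, s in zip(bx, se_by))
    return num / den


# Toy variant-exposure associations (rows: variants, columns: exposures).
# Exposures 1 and 2 are nearly collinear; exposure 3 is close to zero.
beta_exposure = [
    [0.10, 0.11, 0.00],
    [0.20, 0.19, 0.01],
    [-0.15, -0.14, 0.00],
    [0.05, 0.06, -0.01],
    [0.12, 0.12, 0.02],
    [-0.08, -0.09, 0.00],
]

# Step 1: sparse loading, then transform the summary statistics.
loading = sparse_first_loading(beta_exposure, lam=0.1)
component = [sum(b * l for b, l in zip(row, loading)) for row in beta_exposure]

# Toy variant-outcome associations, constructed with a known effect of 0.5
# per unit of the component so the final estimate can be sanity-checked.
beta_outcome = [0.5 * c for c in component]
se_outcome = [0.02] * len(component)

# Step 3: MR on the transformed exposure.
estimate = ivw_estimate(component, beta_outcome, se_outcome)
print("loading:", loading)
print("IVW estimate:", estimate)
```

In this toy setting the thresholding sets the loading of the third exposure to zero, so the component is a weighted combination of the two correlated exposures only, which is the kind of interpretable grouping the sparse approach aims for.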

Keywords: Mendelian randomisation; causal inference; coronary heart disease; dimensionality reduction; genetics; genomics; multivariable Mendelian randomization; principal component analysis; sparsity.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Causality
  • Coronary Disease*
  • Genome-Wide Association Study*
  • Humans
  • Lipids
  • Mendelian Randomization Analysis / methods

Substances

  • Lipids