Exhaustive Variant Interaction Analysis using Multifactor Dimensionality Reduction

Res Sq [Preprint]. 2023 Oct 16:rs.3.rs-3401025. doi: 10.21203/rs.3.rs-3401025/v1.

Abstract

One of the main goals of human genetics is to understand how genomic variation relates to the predisposition to develop a complex disorder. Disease-variant associations are usually studied one variant at a time, disregarding any effect that arises from interactions between genomic variants. In complex diseases in particular, these interactions can be directly linked to the disorder and may play an important role in its development. Although studying them has been suggested as a way to complete our understanding of the genetic basis of complex diseases, doing so remains a major challenge because of its large computing demands. Here, we have taken advantage of High-Performance Computing technologies to tackle this problem using a combination of machine learning methods and statistical approaches. As a result, we have created a containerized framework that uses Multifactor Dimensionality Reduction to detect pairs of variants associated with Type 2 Diabetes (T2D). This methodology has been tested in the Northwestern University NUgene project cohort using a dataset of 1,883,192 variant pairs with a certain degree of association with T2D. Out of the pairs studied, we have identified 104 significant pairs, two of which exhibit a potential functional relationship with T2D.
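
For readers unfamiliar with Multifactor Dimensionality Reduction, the core step for a single variant pair can be sketched as follows. This is a minimal illustration of the general technique, not the framework's implementation: the function name `mdr_pair`, the 0/1/2 genotype coding, and the use of balanced accuracy as the score are assumptions, and the cross-validation and significance testing that a full analysis would normally include are omitted.

```python
import numpy as np

def mdr_pair(snp_a, snp_b, is_case):
    """Reduce one variant pair to a binary high/low-risk attribute and score it.

    Hypothetical helper: genotypes are assumed to be coded 0, 1 or 2
    (copies of the minor allele) and is_case marks affected samples.
    """
    snp_a = np.asarray(snp_a)
    snp_b = np.asarray(snp_b)
    is_case = np.asarray(is_case, dtype=bool)

    # Count cases and controls in each of the 3 x 3 genotype cells.
    cases = np.zeros((3, 3))
    controls = np.zeros((3, 3))
    np.add.at(cases, (snp_a[is_case], snp_b[is_case]), 1)
    np.add.at(controls, (snp_a[~is_case], snp_b[~is_case]), 1)

    # A cell is high risk when its case:control ratio exceeds the overall
    # ratio. Cross-multiplying avoids dividing by empty cells.
    n_cases, n_controls = is_case.sum(), (~is_case).sum()
    high_risk = cases * n_controls > controls * n_cases

    # Predict "case" for every sample that falls in a high-risk cell,
    # then score the reduced one-dimensional attribute.
    predicted = high_risk[snp_a, snp_b]
    sensitivity = (predicted & is_case).sum() / n_cases
    specificity = (~predicted & ~is_case).sum() / n_controls
    return high_risk, (sensitivity + specificity) / 2


# Toy data with a pure interaction: neither variant alone separates cases.
a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0, 1, 1, 0, 0, 1, 1, 0]
cells, balanced_accuracy = mdr_pair(a, b, y)
```

In the toy data each variant on its own is uninformative, yet the pooled high-risk cells separate cases from controls, which is the kind of pairwise effect a single-variant analysis would miss. The sketch assumes at least one case and one control are present.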

Publication types

  • Preprint