omicsNPC: Applying the Non-Parametric Combination Methodology to the Integrative Analysis of Heterogeneous Omics Data

PLoS One. 2016 Nov 3;11(11):e0165545. doi: 10.1371/journal.pone.0165545. eCollection 2016.

Abstract

Background: Advances in omics technologies have made it possible to measure several data modalities on a system of interest. In this work, we illustrate how the Non-Parametric Combination (NPC) methodology can be used to simultaneously assess the association of different molecular quantities with an outcome of interest. We argue that NPC methods have several potential applications in integrating heterogeneous omics technologies, for example identifying genes whose methylation and transcriptional levels are jointly deregulated, or finding proteins whose abundance follows the same trend as the expression of their encoding genes.

Results: We implemented the NPC methodology in "omicsNPC", an R function specifically tailored to the characteristics of omics data. We compare omicsNPC against a range of alternative methods on simulated as well as real data. Comparisons on simulated data indicate that omicsNPC produces unbiased, calibrated p-values and performs as well as or significantly better than the other methods included in the study; furthermore, the analysis of real data shows that omicsNPC (a) exhibits higher statistical power than other methods, (b) is easily applicable in a number of different scenarios, and (c) produces results with improved biological interpretability.

Conclusions: The omicsNPC function behaves competitively in all comparisons conducted in this study. Given that the method (i) requires minimal assumptions, (ii) can be used with different study designs, and (iii) captures the dependencies among heterogeneous data modalities, omicsNPC provides a flexible and statistically powerful solution for the integrative analysis of different omics data.
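
Illustrative sketch: The general NPC idea summarized above can be outlined in a few lines of Python. This is not the omicsNPC implementation (which is an R function); it is a minimal sketch under stated assumptions. The function names `mean_diff` and `npc_fisher`, the two-group design, the absolute difference in means as the per-modality statistic, Fisher's combining function, and the toy data are all choices made here for illustration. The key point is that the same label permutations are reused for every modality, which is how the dependence among modalities is carried into the combined test.

```python
import math
import random


def mean_diff(values, labels):
    """Absolute difference in group means (hypothetical per-modality statistic)."""
    g1 = [v for v, lab in zip(values, labels) if lab == 1]
    g0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def npc_fisher(modalities, labels, n_perm=199, seed=0):
    """Return (partial p-values, global p-value) for a two-group comparison.

    modalities: list of equal-length value lists, one per data modality,
                all measured on the same samples.
    labels:     list of 0/1 group labels, one per sample.
    """
    rng = random.Random(seed)

    # Index 0 holds the observed labelling; the rest are permutations.
    # The SAME labellings are applied to every modality.
    label_sets = [list(labels)]
    for _ in range(n_perm):
        perm = list(labels)
        rng.shuffle(perm)
        label_sets.append(perm)
    total = n_perm + 1

    # For each modality, turn every statistic (observed and permuted)
    # into a p-value relative to the whole permutation distribution.
    partial = []
    for values in modalities:
        stats = [mean_diff(values, ls) for ls in label_sets]
        pvals = [sum(1 for s in stats if s >= t) / total for t in stats]
        partial.append(pvals)

    # Fisher's combining function, evaluated for each labelling.
    combined = [
        -2.0 * sum(math.log(p[b]) for p in partial) for b in range(total)
    ]
    global_p = sum(1 for c in combined if c >= combined[0]) / total
    return [p[0] for p in partial], global_p


# Toy data (made up): 10 controls followed by 10 cases, two modalities.
labels = [0] * 10 + [1] * 10
expression = [1.0 + 0.1 * i for i in range(10)] + [3.0 + 0.1 * i for i in range(10)]
methylation = [0.8 - 0.01 * i for i in range(10)] + [0.3 - 0.01 * i for i in range(10)]

partial_p, global_p = npc_fisher([expression, methylation], labels)
print("partial p-values:", partial_p)
print("global p-value:", global_p)
```

In this sketch each p-value is computed as a proportion over the observed and permuted labellings together, so it can never be zero and the logarithm in the combining step is always defined. Other combining functions could be substituted in the same place.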

MeSH terms

  • Breast Neoplasms / genetics
  • Breast Neoplasms / pathology
  • Gene Expression Profiling*
  • Genomics*
  • Glioblastoma / genetics
  • Humans
  • Neoplasm Invasiveness
  • Schizophrenia / genetics
  • Statistics as Topic / methods*
  • Statistics, Nonparametric*

Grants and funding

NK and VL were partially funded by the STATegra EU FP7 project, No 306000. IT was partially funded by the EPILOGEAS GSRT ARISTEIA II project, No 3446. IT and VL were also partially funded by the European Research Council (ERC) project CAUSALPATH—Next Generation Causal Analysis, No 617393.