A data driven approach reveals disease similarity on a molecular level

NPJ Syst Biol Appl. 2019 Oct 25;5:39. doi: 10.1038/s41540-019-0117-0. eCollection 2019.

Abstract

Could there be unexpected similarities between different studies, diseases, or treatments on a molecular level, due to common biological mechanisms involved? To answer this question, we develop a method for computing similarities between empirical, statistical distributions of high-dimensional, low-sample datasets, and apply it to hundreds of -omics studies. The similarities lead to dataset-to-dataset networks visualizing the landscape of a large portion of biological data. Potentially interesting similarities connecting studies of different diseases are assembled in a disease-to-disease network. Exploring it, we discover numerous non-trivial connections between Alzheimer's disease and schizophrenia, asthma and psoriasis, or liver cancer and obesity, to name a few. We then present a method that identifies the molecular quantities and pathways that contribute the most to the identified similarities and could point to novel drug targets or provide biological insights. The proposed method acts as a "statistical telescope" providing a global view of the constellation of biological data; readers can peek through it at: http://datascope.csd.uoc.gr:25000/.

Keywords: Computer science; Information theory.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Data Analysis
  • Databases, Factual
  • Databases, Genetic
  • Disease / genetics
  • Epidemiologic Methods*
  • Epidemiology
  • Humans
  • Models, Statistical
  • Systems Analysis