TOBMI: trans-omics block missing data imputation using a k-nearest neighbor weighted approach

Bioinformatics. 2019 Apr 15;35(8):1278-1283. doi: 10.1093/bioinformatics/bty796.

Abstract

Motivation: Stitching together trans-omics data is a powerful approach to assess the complex mechanisms of cancer occurrence, progression and treatment. However, the integration process suffers from the 'block missing' phenomenon, in which some individuals lack data for one or more omics types.

Results: We propose a k-nearest neighbor (kNN) weighted imputation method for trans-omics block missing data (TOBMIkNN) that handles individuals whose gene expression data are absent from RNA-seq datasets, using external information obtained from DNA methylation probe datasets. Taking multi-hot deck imputation, mean imputation and deletion of missing cases as reference methods, we assessed relative error, absolute error, change in inter-omics correlation structure and variable selection. The proposed method, TOBMIkNN, reliably imputed RNA-seq data by borrowing information from DNA methylation data, and outperformed the other three methods in imputation error and stability of correlation structure. Our study indicates that TOBMIkNN is a suitable method for trans-omics block missing data imputation.
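
The general idea of kNN weighted imputation across omics layers can be illustrated with a minimal Python sketch. This is not the TOBMIkNN implementation: the abstract does not specify the distance metric, the weighting scheme or the choice of k, so the Euclidean distance, the inverse-distance weights, the function name `knn_weighted_impute` and the toy data below are all assumptions made for illustration. Individuals whose RNA-seq row is entirely missing receive a weighted average of the expression profiles of their k nearest neighbors, where neighbors are found in the DNA methylation data.

```python
import numpy as np


def knn_weighted_impute(methyl, expr, k=3):
    """Hypothetical sketch of kNN weighted block imputation.

    methyl: (n, p) array, complete for all n individuals.
    expr:   (n, q) array; block-missing individuals have an all-NaN row.
    Distance metric and weights are assumptions, not the published method.
    """
    methyl = np.asarray(methyl, dtype=float)
    expr = np.asarray(expr, dtype=float)

    # Individuals lacking the whole expression block vs. complete donors.
    missing = np.isnan(expr).all(axis=1)
    donors = np.flatnonzero(~missing)

    imputed = expr.copy()
    for i in np.flatnonzero(missing):
        # Assumed metric: Euclidean distance in methylation space.
        dist = np.linalg.norm(methyl[donors] - methyl[i], axis=1)
        nearest = np.argsort(dist)[:k]
        # Assumed weights: inverse distance, normalized to sum to 1.
        weights = 1.0 / (dist[nearest] + 1e-12)
        weights /= weights.sum()
        imputed[i] = weights @ expr[donors[nearest]]
    return imputed


# Toy data (made up): five individuals, the last one lacks expression data.
methyl = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [0.1, 0.0]])
expr = np.array([[1.0, 2.0], [11.0, 12.0], [5.0, 5.0], [100.0, 100.0],
                 [np.nan, np.nan]])
result = knn_weighted_impute(methyl, expr, k=2)
```

In this toy case the imputed row is dominated by the closest donor, because inverse-distance weights shrink quickly as methylation profiles diverge; a different weighting scheme or metric would change that balance.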

Availability and implementation: TOBMIkNN is freely available at https://github.com/XuesiDong/TOBMI.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis*
  • Humans
  • Research Design