Statistical methods for integrating multiple types of high-throughput data

Methods Mol Biol. 2010:620:511-29. doi: 10.1007/978-1-60761-580-4_19.

Abstract

Large-scale sequencing, copy number, mRNA, and protein data hold great promise for biomedical research, while posing great challenges for data management and data analysis. Integrating different types of high-throughput data from diverse sources can increase the statistical power of data analysis and provide deeper biological understanding. This chapter uses two biomedical research examples to illustrate why there is an urgent need to develop reliable and robust methods for integrating heterogeneous data. We then introduce and review some recently developed statistical methods for integrative analysis, for both statistical inference and classification purposes. Finally, we present some useful publicly accessible databases and program code to facilitate integrative analysis in practice.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Biostatistics / methods*
  • Classification
  • Data Interpretation, Statistical
  • Databases, Genetic
  • Gene Expression Regulation
  • Humans
  • Neoplasms / diagnosis