Identification of genes for complex diseases by integrating multiple types of genomic data

Annu Int Conf IEEE Eng Med Biol Soc. 2012:2012:5541-4. doi: 10.1109/EMBC.2012.6347249.

Abstract

Combining multiple types of genomic data in an integrative analysis can exploit complementary information and can therefore have higher power to identify genes/variables that could not be identified by analyzing any single data type alone. Here we propose a sparse representation based clustering (SRC) method for integrative data analysis. We applied the SRC method to the integrative analysis of 376,821 SNPs in 200 subjects (100 cases and 100 controls) and expression data for 22,283 genes in 80 subjects (40 cases and 40 controls) to identify significant genes for osteoporosis (OP). Comparing our results with previous studies, we identified genes already known to be related to OP risk, as well as novel osteoporosis susceptibility genes ('DICER1', 'PTMA', etc.) that may play important roles in osteoporosis etiology. In addition, the genes identified by the SRC method led to higher accuracy in identifying osteoporosis subjects than genes selected by the traditional t-test and Fisher's exact test, which further validates the proposed SRC approach for integrative analysis.

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Algorithms
  • Genetic Predisposition to Disease*
  • Genomics*
  • Humans
  • Polymorphism, Single Nucleotide