Analysis of SNP-expression association matrices

J Bioinform Comput Biol. 2006 Apr;4(2):259-74. doi: 10.1142/s0219720006001953.

Abstract

High-throughput expression profiling and genotyping technologies provide the means to study the genetic determinants of population variation in gene expression. In this paper we present a general statistical framework for the simultaneous analysis of gene expression data and SNP genotype data measured for the same cohort. The framework consists of methods to associate transcripts with SNPs affecting their expression, algorithms to detect subsets of transcripts that share significantly many associations with a subset of SNPs, and methods to visualize the identified relations. We apply our framework to SNP-expression data collected from 50 breast cancer patients. Our results demonstrate an overabundance of transcript-SNP associations in these data, and pinpoint SNPs that are potential master regulators of transcription. We also identify several statistically significant transcript subsets with common putative regulators that fall into well-defined functional categories.
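As a rough illustration of what a SNP-expression association matrix might look like, the sketch below builds a binary transcript-by-SNP matrix and counts associations per SNP. The abstract does not name the association test, so this sketch assumes a simple stand-in: the absolute Pearson correlation between genotype (coded 0/1/2) and expression, compared against a threshold. All function names, the threshold, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def association_matrix(expr, geno, threshold):
    """Binary matrix: rows are transcripts, columns are SNPs.

    expr: one list of per-sample expression values per transcript.
    geno: one list of per-sample genotypes (0/1/2) per SNP.
    An entry is 1 when |r| >= threshold (an assumed criterion, not the paper's test).
    """
    return [[1 if abs(pearson(t, s)) >= threshold else 0 for s in geno]
            for t in expr]


def associations_per_snp(matrix):
    """Column sums: SNPs with many associated transcripts are candidate regulators."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix[0]))]


def permuted_total(expr, geno, threshold, rng):
    """Total association count after shuffling sample labels of the genotypes.

    Repeating this gives a null distribution against which the observed total
    can be compared to judge overabundance.
    """
    order = list(range(len(geno[0])))
    rng.shuffle(order)
    shuffled = [[s[i] for i in order] for s in geno]
    return sum(map(sum, association_matrix(expr, shuffled, threshold)))


# Toy data: 3 transcripts and 2 SNPs measured on 6 samples (made up for illustration).
toy_geno = [[0, 1, 2, 0, 1, 2],
            [2, 2, 0, 0, 1, 1]]
toy_expr = [[1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
            [0.5, 1.5, 2.5, 0.6, 1.4, 2.6],
            [3.0, 1.0, 2.0, 2.0, 3.0, 1.0]]

observed = association_matrix(toy_expr, toy_geno, threshold=0.9)
print(observed)
print(associations_per_snp(observed))
print(permuted_total(toy_expr, toy_geno, 0.9, random.Random(0)))
```

In a real analysis the threshold would be replaced by a proper significance test with multiple-testing control, and the permutation step would be repeated many times; the sketch only shows the shape of the data structure.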

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Artificial Intelligence
  • Base Sequence
  • Breast Neoplasms / genetics*
  • Expressed Sequence Tags
  • Gene Expression / genetics
  • Genetic Predisposition to Disease / genetics
  • Humans
  • Molecular Sequence Data
  • Neoplasm Proteins / genetics*
  • Pattern Recognition, Automated / methods
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Transcription Factors / genetics*

Substances

  • Neoplasm Proteins
  • Transcription Factors