Comparison of linear weighting schemes for perfect match and mismatch gene expression levels from microarray data

Am J Pharmacogenomics. 2005;5(3):197-205. doi: 10.2165/00129785-200505030-00006.

Abstract

Background: Data analytic approaches to Affymetrix microarray data include: (a) a covariate model, in which the observed signal is an estimated linear function of perfect match (PM) and mismatch (MM) signals; (b) a difference model [PM-MM]; and (c) a PM-only model, in which MM data are not used.
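
One way to see the three approaches side by side is as members of a single family, signal = PM - w * MM, that differ only in the weight w. The sketch below is illustrative and not the paper's own formulation: the function names `weighted_signal` and `covariate_weight`, the choice of an ordinary least-squares slope as the "estimated" weight, and the probe intensities are all assumptions made for the example.

```python
import numpy as np


def weighted_signal(pm, mm, w):
    """Return PM - w * MM for arrays of probe intensities."""
    pm = np.asarray(pm, dtype=float)
    mm = np.asarray(mm, dtype=float)
    return pm - w * mm


def covariate_weight(pm, mm):
    """Least-squares slope of PM regressed on MM (intercept included).

    Assumption: this stands in for the 'estimated linear function' of the
    covariate model; the paper may estimate the weight differently.
    """
    pm = np.asarray(pm, dtype=float)
    mm = np.asarray(mm, dtype=float)
    mm_centered = mm - mm.mean()
    return float(np.dot(mm_centered, pm - pm.mean()) / np.dot(mm_centered, mm_centered))


# Made-up intensities for five probe pairs (illustration only).
pm = np.array([120.0, 340.0, 560.0, 410.0, 275.0])
mm = np.array([60.0, 110.0, 150.0, 130.0, 95.0])

pm_only = weighted_signal(pm, mm, 0.0)      # (c) PM-only: MM ignored, w = 0
difference = weighted_signal(pm, mm, 1.0)   # (b) difference: PM - MM, w = 1
w_hat = covariate_weight(pm, mm)            # (a) covariate: w estimated from data
covariate = weighted_signal(pm, mm, w_hat)
```

Under these assumptions the PM-only and difference models are the fixed-weight special cases w = 0 and w = 1, while the covariate model lets the data choose w, which is the sense in which it is the most flexible of the three.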

Methods: By decomposing the correlations among the variables in the statistical model and making certain assumptions, we theoretically derive the statistical model that reflects the actual gene expression level under a variety of conditions expected in microarray data.

Results and conclusion: When modeling non-systematic variation, the covariate model provides maximum flexibility and often reflects the actual gene expression levels better than the difference model. However, the PM-only model demonstrates superior power in an overwhelming majority of realistic situations, which provides theoretical support for the current trend of employing PM-only models in microarray data analyses.

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Analysis of Variance
  • Data Interpretation, Statistical
  • Gene Expression Profiling / statistics & numerical data*
  • Linear Models
  • Models, Statistical
  • Molecular Probe Techniques
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Pharmacogenetics
  • RNA, Messenger / genetics

Substances

  • RNA, Messenger