Evaluating methods for ranking differentially expressed genes applied to microArray quality control data

BMC Bioinformatics. 2011 Jun 6:12:227. doi: 10.1186/1471-2105-12-227.

Abstract

Background: Statistical methods for ranking differentially expressed genes (DEGs) from gene expression data should be evaluated with regard to high sensitivity, specificity, and reproducibility. In our previous studies, we evaluated eight gene ranking methods applied to only Affymetrix GeneChip data. A more general evaluation that also includes other microarray platforms, such as the Agilent or Illumina systems, is desirable for determining which methods are suitable for each platform and which method has better inter-platform reproducibility.

Results: We compared the eight gene ranking methods using the MicroArray Quality Control (MAQC) datasets produced by five manufacturers: Affymetrix, Applied Biosystems, Agilent, GE Healthcare, and Illumina. The area under the curve (AUC) was used as a measure for both sensitivity and specificity. Although the highest AUC values can vary with the definition of "true" DEGs, the best methods were, in most cases, either the weighted average difference (WAD), rank products (RP), or intensity-based moderated t statistic (ibmT). The percentages of overlapping genes (POGs) across different test sites were mainly evaluated as a measure for both intra- and inter-platform reproducibility. The POG values for WAD were the highest overall, irrespective of the choice of microarray platform. The high intra- and inter-platform reproducibility of WAD was also observed at a higher biological function level.

Conclusion: These results for the five microarray platforms were consistent with our previous ones based on 36 real experimental datasets measured using the Affymetrix platform. Thus, recommendations made using the MAQC benchmark data might be universally applicable.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biometry / methods
  • Gene Expression Profiling / methods*
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis / methods*
  • Quality Control
  • Research Design
  • Sensitivity and Specificity