Handling multiple testing while interpreting microarrays with the Gene Ontology Database

BMC Bioinformatics. 2004 Sep 6;5:124. doi: 10.1186/1471-2105-5-124.

Abstract

Background: Multiple research groups are developing software tools that analyze microarray data in the context of genetic knowledgebases, each using different methods. A common problem for many of these tools is how to correct for multiple statistical testing, since simple corrections are overly conservative and more sophisticated corrections are currently impractical. A careful study of the distribution one would expect by chance, for example through simulation, may guide the development of an appropriate correction that is not overly time-consuming to compute.

Results: We present the results from a preliminary study of the distribution one would expect for analyzing sets of genes extracted from Drosophila, S. cerevisiae, Wormbase, and Gramene databases using the Gene Ontology Database.

Conclusions: We found that the estimated distribution is not regular and is not predictable outside of a particular set of genes. Permutation-based simulations may be necessary to determine the confidence in results of such analyses.
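The kind of permutation-based simulation mentioned in the conclusions can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a one-sided hypergeometric enrichment test per term and uses the minimum p-value over all terms as the summary statistic. The function names and the `annotations` mapping (term name to set of annotated genes) are hypothetical.

```python
import random
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    # Sum exact integer counts first, then divide once.
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def min_p(gene_set, annotations, universe_size):
    """Smallest enrichment p-value over all terms for one gene set."""
    n = len(gene_set)
    best = 1.0
    for term_genes in annotations.values():
        k = len(gene_set & term_genes)
        best = min(best, hypergeom_sf(k, universe_size, len(term_genes), n))
    return best

def permutation_null(universe, annotations, set_size, n_perm, rng):
    """Sorted null distribution of the minimum p-value from random gene sets.

    `universe` must be a list so that rng.sample can draw from it.
    """
    return sorted(
        min_p(set(rng.sample(universe, set_size)), annotations, len(universe))
        for _ in range(n_perm)
    )

def adjusted_p(observed_p, null_min_ps):
    """Share of random sets whose best term is at least as extreme.

    Uses the (hits + 1) / (n + 1) form so the estimate is never exactly zero.
    """
    hits = sum(1 for p in null_min_ps if p <= observed_p)
    return (hits + 1) / (len(null_min_ps) + 1)
```

Under these assumptions, one would call `permutation_null` with the full gene list, the term annotations, and the size of the observed gene set, then pass each observed per-term p-value to `adjusted_p`. Because the null distribution is rebuilt for each gene universe and set size, the sketch is consistent with the finding that the distribution is not predictable outside a particular set of genes.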

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Computational Biology / statistics & numerical data
  • Data Interpretation, Statistical
  • Databases, Genetic / statistics & numerical data*
  • Drosophila / genetics
  • Gene Expression Profiling / statistics & numerical data*
  • Helminths / genetics
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Saccharomyces cerevisiae / genetics
  • Software / statistics & numerical data