Fast gene ontology based clustering for microarray experiments

BioData Min. 2008 Nov 21;1(1):11. doi: 10.1186/1756-0381-1-11.

Abstract

Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses.

Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster.

Conclusion: Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.