Evaluation of a gene information summarization system by users during the analysis process of microarray datasets

BMC Bioinformatics. 2009 Feb 5;10 Suppl 2(Suppl 2):S5. doi: 10.1186/1471-2105-10-S2-S5.

Abstract

Background: Summarization of gene information in the literature has the potential to help genomics researchers translate basic research into clinical benefits. Gene expression microarrays have been used to study biomarkers for disease and to discover novel types of therapeutics, and finding information in journal articles on sets of genes is a common task for translational researchers working with microarray data. However, manually searching and scanning the literature references returned from PubMed is time-consuming for scientists. We built and evaluated an automatic summarizer of information on genes studied in microarray experiments. The Gene Information Clustering and Summarization System (GICSS) integrates two related steps of the microarray data analysis process: functional gene clustering and gene information gathering. The system was evaluated while genomic researchers analyzed their own experimental microarray datasets.

Results: The clusters generated by GICSS were validated by scientists during their microarray analysis process. In addition, presenting sentences from the abstract provided significantly more important information to users than showing only the title, as in the default PubMed format.

Conclusion: The evaluation results suggest that GICSS can be useful for researchers in genomics. In addition, the hybrid evaluation method, partway between intrinsic and extrinsic system evaluation, may enable researchers to gauge the true usefulness of the tool for scientists in their natural analysis workflow and to elicit suggestions for future enhancements.

Availability: GICSS can be accessed online at: http://ir.ohsu.edu/jianji/index.html.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Databases, Factual
  • Gene Expression Profiling / methods*
  • Genes
  • Genomics / methods
  • Information Systems
  • Oligonucleotide Array Sequence Analysis / methods*
  • PubMed
  • User-Computer Interface