Gene Ontology semantic similarity tools: survey on features and challenges for biological knowledge discovery

Brief Bioinform. 2017 Sep 1;18(5):886-901. doi: 10.1093/bib/bbw067.

Abstract

Gene Ontology (GO) semantic similarity tools compute semantic similarity scores that incorporate the biological knowledge embedded in the GO structure, allowing proteins or lists of proteins to be compared or classified based on their GO annotations. This facilitates a better understanding of the biological phenomena underlying a given experiment and enables the identification of processes pertinent to different biological conditions. Currently, about 14 such tools are available; they may play an important role in improving protein analyses at the functional level using different GO semantic similarity measures. Here we survey these tools to provide a comprehensive view of the challenges and advances in this area, so as to avoid redundant effort in developing features that already exist or implementing ideas already proven obsolete in the context of GO. The survey helps researchers, tool developers and end users understand the semantic similarity measures each tool implements, along with its pertinent features and known issues. This should empower users to make appropriate choices for their biological applications and ensure effective knowledge discovery based on GO annotations.

Keywords: Gene Ontology annotations; gene ontology; protein functional analysis; protein functional similarity; semantic similarity tools.

MeSH terms

  • Gene Ontology*
  • Humans
  • Molecular Sequence Annotation
  • Semantics
  • Surveys and Questionnaires