Additional gene ontology structure for improved biological reasoning

Bioinformatics. 2006 Aug 15;22(16):2020-7. doi: 10.1093/bioinformatics/btl334. Epub 2006 Jun 20.

Abstract

Motivation: The Gene Ontology (GO) is a widely used terminology for characterizing gene products, for example when interpreting the biology underlying microarray experiments. The current GO defines term relationships only within each of its three independent subontologies: molecular function, biological process and cellular component. However, biological relationships evidently also exist between terms of different subontologies. Our aim was to connect the three subontologies so that GO covers more biological knowledge, can be used more consistently and provides new opportunities for biological reasoning.

Results: We propose a new structure, the Second Gene Ontology Layer, consisting of paths that capture biological relations not directly reflected in the present ontology structure. Given molecular functions, these paths identify the biological processes in which the molecular functions are involved and the cellular components in which they are active. The current Second Layer contains 6271 validated paths, covers 54% of the molecular functions of GO and can be used to render existing gene annotation sets more complete and consistent. Applying Second Layer paths to a set of 4223 human genes increased the number of biological process annotations by 24% compared with publicly available annotations and reproduced 30% of them.
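
The idea of applying such paths to an annotation set can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the path table, the term identifiers (toy placeholders, not real GO accessions), the gene names and both function names are assumptions made for the example. The two ratios are one plausible reading of "reproduced" and "increased", both taken relative to the existing biological process annotations.

```python
# Hypothetical Second Layer paths: molecular function -> related terms in the
# other two subontologies. All identifiers are toy placeholders.
SECOND_LAYER_PATHS = {
    "MF:kinase_activity": {"BP:phosphorylation"},
    "MF:dna_binding": {"BP:transcription", "CC:nucleus"},
}


def infer_annotations(gene_functions, paths):
    """Follow every path that starts at one of a gene's molecular functions."""
    return {
        gene: set().union(*(paths.get(mf, set()) for mf in functions))
        for gene, functions in gene_functions.items()
    }


def compare_bp(existing_bp, inferred):
    """Return (reproduced, added) as fractions of existing BP annotations."""
    existing_pairs = {(g, t) for g, terms in existing_bp.items() for t in terms}
    inferred_pairs = {
        (g, t)
        for g, terms in inferred.items()
        for t in terms
        if t.startswith("BP:")  # only biological process terms are compared
    }
    if not existing_pairs:
        return 0.0, 0.0
    reproduced = len(existing_pairs & inferred_pairs) / len(existing_pairs)
    added = len(inferred_pairs - existing_pairs) / len(existing_pairs)
    return reproduced, added


# Toy input: molecular function annotations and existing BP annotations.
gene_functions = {
    "geneA": {"MF:kinase_activity"},
    "geneB": {"MF:dna_binding"},
}
existing_bp = {
    "geneA": {"BP:phosphorylation"},
    "geneB": {"BP:cell_cycle"},
}

inferred = infer_annotations(gene_functions, SECOND_LAYER_PATHS)
reproduced, added = compare_bp(existing_bp, inferred)
print(inferred)
print(f"reproduced: {reproduced:.0%}, added: {added:.0%}")
```

In this toy data one of the two existing biological process annotations is recovered through a path and one new annotation is proposed, so both ratios come out at one half. The figures reported above (30% and 24%) were obtained on 4223 human genes with validated paths and cannot be reproduced from this sketch.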

Availability: The Second GO Layer is publicly available through the GO Annotation Toolbox (GOAT.no): http://www.goat.no.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Database Management Systems
  • Genes
  • Genome
  • Genomics / methods*
  • Humans
  • Models, Biological
  • Models, Genetic
  • Models, Theoretical
  • Software
  • Terminology as Topic
  • Vocabulary, Controlled