Systematic Prediction of Orthologous Units of Genes in the Complete Genomes

Genome Inform Ser Workshop Genome Inform. 1998;9:32-40.

Abstract

To make full use of the vast amount of information in complete genome sequences, we are developing a genome-scale system for predicting gene functions and cellular functions. The system uses sequence similarity, positional correlations of genes in the genome, and the reference knowledge stored as ortholog group tables in KEGG (Kyoto Encyclopedia of Genes and Genomes). An ortholog group table summarizes orthologous and paralogous relations among different organisms for a set of genes that are considered to form a functional unit, such as a conserved portion of a metabolic pathway or a molecular machinery for membrane transport. At present, ortholog group tables are constructed for cases where the genes are clustered in physically close positions in the genome of at least one organism. In this paper, we describe the system and its application to the complete genome of Pyrococcus horikoshii to identify ABC transporters.
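The positional criterion mentioned above (genes of a functional unit lying physically close together in a genome) can be illustrated with a minimal sketch. This is not the authors' system: the `Gene` record, the `positional_clusters` function, the `max_distance` and `min_size` thresholds, and the gene identifiers are all hypothetical, and the sketch assumes that a similarity search has already labeled each gene with the component of the functional unit it matches, if any.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    gene_id: str               # hypothetical identifier, not a real locus tag
    position: int              # ordinal position of the gene along the genome
    component: Optional[str]   # matched component of a functional unit, or None


def positional_clusters(genes: List[Gene], max_distance: int = 2,
                        min_size: int = 2) -> List[List[Gene]]:
    """Group similarity hits whose neighbouring positions differ by at most
    max_distance, and keep only groups with at least min_size genes.
    Both thresholds are assumptions chosen for illustration."""
    hits = sorted((g for g in genes if g.component is not None),
                  key=lambda g: g.position)
    clusters: List[List[Gene]] = []
    current: List[Gene] = []
    for gene in hits:
        # Start a new group when the gap to the previous hit is too large.
        if current and gene.position - current[-1].position > max_distance:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(gene)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Toy input: component labels follow the usual ABC transporter parts,
# but the genes and positions are invented for this example.
genes = [
    Gene("g1", 1, "substrate-binding"),
    Gene("g2", 2, "permease"),
    Gene("g3", 3, "ATP-binding"),
    Gene("g4", 4, None),
    Gene("g5", 9, "ATP-binding"),
    Gene("g6", 15, "permease"),
    Gene("g7", 16, "ATP-binding"),
]
cluster_ids = [[g.gene_id for g in c] for c in positional_clusters(genes)]
print(cluster_ids)
```

In this toy input the isolated hit `g5` is discarded because it has no nearby partner, while the two groups of neighbouring hits are retained as candidate units.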