Estimating support for protein-protein interaction data with applications to function prediction

Erliang Zeng; Chris Ding; Giri Narasimhan; Stephen R Holbrook

Comput Syst Bioinformatics Conf. 2008:7:73-84.

Authors

Erliang Zeng¹, Chris Ding, Giri Narasimhan, Stephen R Holbrook

Affiliation

¹ Bioinformatics Research Group, School of Computing and Information Sciences, Florida International University, Miami, FL 33199, USA. ezeng001@cs.fiu.edu

PMID: 19642270

Abstract

Almost every cellular process requires the interaction of pairs or larger complexes of proteins. High-throughput protein-protein interaction (PPI) data have been generated using techniques such as yeast two-hybrid systems and mass spectrometry, among others. Such data offer a new basis for predicting protein function and for constructing protein-protein interaction networks, and many recent algorithms have been developed for these purposes. However, PPI data generated using high-throughput techniques contain a large number of false positives. In this paper, we propose a novel method to evaluate the support for PPI data based on Gene Ontology information. When the semantic similarity between genes is computed from Gene Ontology information using Resnik's formula, our results show that the PPI data can be modeled as a mixture model, predicated on the assumption that true protein-protein interactions have higher support than the false positives in the data. Semantic similarity between genes thus serves as a metric of support for PPI data. Taking this one step further, we also propose new function prediction approaches that use this metric of support. These new approaches outperform their conventional counterparts. New evaluation methods are also proposed.
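
As a rough illustration of the support metric described in the abstract, the sketch below computes Resnik's semantic similarity, that is, the information content -log p(c) of the most informative common ancestor c of two terms, on a toy ontology. The ontology, the gene annotations, and all function names are hypothetical and are not taken from the paper. Scoring a gene pair by the maximum over its term pairs is likewise an assumption made here for illustration, not necessarily the authors' exact procedure.

```python
import math

# Hypothetical toy ontology: term -> set of direct parent terms.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}

# Hypothetical direct annotations: gene -> set of terms.
ANNOTATIONS = {
    "geneA": {"dna_binding"},
    "geneB": {"rna_binding"},
    "geneC": {"catalysis"},
    "geneD": {"dna_binding"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_probabilities():
    """p(t): fraction of genes annotated to t or to any descendant of t."""
    counts = {t: 0 for t in PARENTS}
    for terms in ANNOTATIONS.values():
        covered = set()
        for t in terms:
            covered |= ancestors(t)
        for t in covered:
            counts[t] += 1
    n = len(ANNOTATIONS)
    return {t: c / n for t, c in counts.items()}


def resnik_term_similarity(t1, t2, prob):
    """Information content of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max(-math.log(prob[c]) for c in common if prob[c] > 0)


def gene_similarity(g1, g2):
    """Assumed aggregation: best Resnik score over all term pairs."""
    prob = term_probabilities()
    return max(
        resnik_term_similarity(t1, t2, prob)
        for t1 in ANNOTATIONS[g1]
        for t2 in ANNOTATIONS[g2]
    )


if __name__ == "__main__":
    for pair in [("geneA", "geneD"), ("geneA", "geneB"), ("geneA", "geneC")]:
        print(pair, round(gene_similarity(*pair), 4))
```

In this toy setting, a pair sharing a specific term scores higher than a pair sharing only a general ancestor, which in turn scores higher than a pair sharing only the root. That ordering is the intuition behind treating semantic similarity as support for a reported interaction.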

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Computer Simulation
Gene Expression Profiling / methods*
Models, Biological*
Protein Interaction Mapping / methods*
Proteome / metabolism*
Signal Transduction / physiology*

Substances

Proteome
