Link prediction in protein-protein interaction network: A similarity multiplied similarity algorithm with paths of length three

J Theor Biol. 2024 Jul 21:589:111850. doi: 10.1016/j.jtbi.2024.111850. Epub 2024 May 11.

Abstract

Protein-protein interactions (PPIs) are crucial for various biological processes, and predicting PPIs is a major challenge. Link prediction is the most common computational approach to this problem, and link prediction methods based on network paths of length three (L3) have proven highly effective. In this paper, we propose a novel link prediction algorithm, named SMS (Similarity Multiplied Similarity), which is based on L3 and protein similarities. We first design a mixed similarity that combines the topological structure and attribute features of nodes. Then, we compute the predicted value for a candidate protein pair by summing the products of the similarities along all L3 paths. Furthermore, we propose the Max Similarity Multiplied Similarity (maxSMS) algorithm from the perspective of maximum impact. Our computational prediction results show that on six datasets, including S. cerevisiae, H. sapiens, and others, the maxSMS algorithm improves the precision of the top 500 predictions, the area under the precision-recall curve, and the normalized discounted cumulative gain by an average of 26.99%, 53.67%, and 6.7%, respectively, compared with the best-performing competing methods.
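The scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the exact similarity formula or which node pairs enter the product, so everything below is an assumption made for illustration. The function names (`jaccard`, `mixed_similarity`, `l3_scores`), the weight `alpha`, the use of Jaccard overlap for both the topological and the attribute part, the choice to multiply the similarities of the three consecutive node pairs on a path, and the reading of the "max" variant as keeping only the strongest path are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mixed_similarity(adj, attrs, alpha=0.5):
    """Return sim(p, q) blending neighbourhood and attribute overlap.

    Assumption: a linear blend of two Jaccard scores stands in for the
    paper's mixed similarity, whose real definition is not in the abstract.
    """
    def sim(p, q):
        topo = jaccard(adj[p], adj[q])
        attr = jaccard(attrs.get(p, set()), attrs.get(q, set()))
        return alpha * topo + (1 - alpha) * attr
    return sim


def l3_scores(adj, sim, mode="sum"):
    """Score non-adjacent pairs (x, y) over simple paths x-u-v-y.

    adj  : dict mapping each node to the set of its neighbours (undirected)
    sim  : callable (p, q) -> float
    mode : "sum" adds the path weights, "max" keeps the largest one
           (an assumed reading of the 'maximum impact' variant)
    """
    if mode not in ("sum", "max"):
        raise ValueError("mode must be 'sum' or 'max'")
    scores = defaultdict(float)
    for x in adj:
        for u in adj[x]:
            for v in adj[u]:
                if v == x:
                    continue
                for y in adj[v]:
                    # keep simple paths between non-adjacent nodes, and
                    # visit each unordered pair from its smaller endpoint
                    if y == x or y == u or y in adj[x] or not x < y:
                        continue
                    # Assumption: path weight is the product of the
                    # similarities of consecutive nodes on the path.
                    w = sim(x, u) * sim(u, v) * sim(v, y)
                    if mode == "sum":
                        scores[(x, y)] += w
                    else:
                        scores[(x, y)] = max(scores[(x, y)], w)
    return dict(scores)


if __name__ == "__main__":
    # Toy network: two L3 paths connect "a" and "d" (via b-c and via e-c).
    toy_adj = {
        "a": {"b", "e"}, "b": {"a", "c"}, "e": {"a", "c"},
        "c": {"b", "e", "d"}, "d": {"c"},
    }
    toy_attrs = {n: {"term1"} for n in toy_adj}  # made-up annotations
    sim = mixed_similarity(toy_adj, toy_attrs)
    print(l3_scores(toy_adj, sim, mode="sum"))
    print(l3_scores(toy_adj, sim, mode="max"))
```

Ranking the candidate pairs by these scores and comparing the top of the list against held-out interactions is how measures such as precision of the top 500 predictions would then be computed.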

Keywords: Link prediction; Paths of length three; Protein similarity; Protein–protein interaction.

MeSH terms

  • Algorithms*
  • Computational Biology / methods
  • Databases, Protein
  • Humans
  • Protein Interaction Mapping* / methods
  • Protein Interaction Maps*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism
  • Saccharomyces cerevisiae Proteins / genetics
  • Saccharomyces cerevisiae Proteins / metabolism

Substances

  • Saccharomyces cerevisiae Proteins