Learning complex subcellular distribution patterns of proteins via analysis of immunohistochemistry images

Bioinformatics. 2020 Mar 1;36(6):1908-1914. doi: 10.1093/bioinformatics/btz844.

Abstract

Motivation: Systematic and comprehensive analysis of protein subcellular location, a critical part of proteomics ('location proteomics'), has been studied for many years, but annotating protein subcellular locations and understanding how location patterns vary across cell types and states remain challenging.

Results: In this work, we used immunohistochemistry images from the Human Protein Atlas as the source of subcellular location information, and built classification models for the complex protein spatial distribution in normal and cancerous tissues. The models can automatically estimate the fractions of protein in different subcellular locations, and can help to quantify the changes of protein distribution from normal to cancer tissues. In addition, we examined the extent to which different annotated protein pathways and complexes showed similarity in the locations of their member proteins, and then predicted new potential proteins for these networks.
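
To make the idea of "fractions of protein in different subcellular locations" concrete, the following is a minimal sketch and not the authors' model. It assumes some classifier has already produced non-negative per-location scores for one protein in normal and in cancer tissue; the location names, the score values, and the helper functions `to_fractions` and `distribution_shift` are all hypothetical. The total variation distance is used here only as one simple way to summarize a change in distribution.

```python
def to_fractions(scores):
    """Normalize non-negative per-location scores so that they sum to 1."""
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return {loc: s / total for loc, s in scores.items()}


def distribution_shift(normal, cancer):
    """Return the per-location change (cancer minus normal) and the
    total variation distance between the two fraction vectors."""
    locations = sorted(set(normal) | set(cancer))
    delta = {
        loc: cancer.get(loc, 0.0) - normal.get(loc, 0.0) for loc in locations
    }
    total_variation = 0.5 * sum(abs(d) for d in delta.values())
    return delta, total_variation


# Hypothetical scores for a single protein (illustrative values only).
normal = to_fractions({"nucleus": 3.0, "cytoplasm": 1.0})
cancer = to_fractions({"nucleus": 1.0, "cytoplasm": 3.0})
delta, total_variation = distribution_shift(normal, cancer)
```

In this toy case the nuclear fraction drops while the cytoplasmic fraction rises by the same amount; a total variation of 0 would mean identical distributions and 1 would mean no overlap at all.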

Availability and implementation: The dataset and code are available at: www.csbio.sjtu.edu.cn/bioinf/complexsubcellularpatterns.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Immunohistochemistry
  • Neoplasms*
  • Proteins*
  • Proteomics
  • Subcellular Fractions

Substances

  • Proteins