The language of the protein universe

Curr Opin Genet Dev. 2015 Dec:35:50-6. doi: 10.1016/j.gde.2015.08.010. Epub 2015 Nov 3.

Abstract

Proteins, the cell's main machinery, play a major role in nearly every cellular process and have always been a central focus of biology. We live in the post-genomic era, and inferring information from massive data sets is a steadily growing, universal challenge. The increasing availability of fully sequenced genomes can be regarded as the 'Rosetta Stone' of the protein universe, allowing us to understand genomes and their evolution, just as the original Rosetta Stone allowed Champollion to decipher ancient Egyptian hieroglyphs. In this review, we consider aspects of the repertoire of protein domain architectures that are closely related to those of human languages, and we aim to provide some insights into the language of proteins.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Amino Acid Sequence*
  • Computational Biology
  • Evolution, Molecular
  • Genome / genetics*
  • Humans
  • Language*
  • Protein Structure, Tertiary*
  • Proteins / metabolism*

Substances

  • Proteins