Domain rearrangements in protein evolution

J Mol Biol. 2005 Nov 4;353(4):911-23. doi: 10.1016/j.jmb.2005.08.067. Epub 2005 Sep 21.

Abstract

Most eukaryotic proteins are multi-domain proteins created through gene fusions, deletions and internal repetitions. Investigating such evolutionary events requires a method for finding the domain architecture from which each protein originates. We therefore defined a novel measure, domain distance, calculated as the number of domains that differ between two domain architectures. Using this measure, we studied the evolutionary events that distinguish a protein from its closest ancestor and found that indels are more common than internal repetitions and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini but rare between domains. The evolution of the majority of multi-domain proteins can be explained by the stepwise insertion of single domains, with the exception of repeats, in which several domains are sometimes duplicated in tandem. We show that domain distances agree with sequence similarity and with semantic similarity based on Gene Ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of two protein families, non-receptor tyrosine kinases and RhoGEFs.
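
The domain distance described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the distance is the number of domains that differ between two architectures; it does not give the algorithm. The sketch assumes, purely for illustration, that an architecture is an ordered list of domain names and that the distance is a unit-cost edit distance in which inserting, deleting or exchanging one domain each counts as 1. The function name `domain_distance` and the domain labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def domain_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost edit distance between two ordered domain architectures.

    Assumption (not from the paper): insertion, deletion and exchange
    of a single domain each cost 1.
    """
    # prev[j] holds the distance between the first i-1 domains of `a`
    # and the first j domains of `b`.
    prev = list(range(len(b) + 1))
    for i, dom_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty architecture
        for j, dom_b in enumerate(b, start=1):
            exchange = 0 if dom_a == dom_b else 1
            curr.append(
                min(
                    prev[j] + 1,             # delete dom_a
                    curr[j - 1] + 1,         # insert dom_b
                    prev[j - 1] + exchange,  # keep or exchange
                )
            )
        prev = curr
    return prev[-1]


# Illustrative architectures, listed from N-terminus to C-terminus.
ancestor = ["SH3", "SH2", "Kinase"]
n_terminal_loss = ["SH2", "Kinase"]
tandem_repeat = ["SH3", "SH3", "SH3", "SH2", "Kinase"]

print(domain_distance(ancestor, n_terminal_loss))
print(domain_distance(ancestor, tandem_repeat))
```

Under these assumptions a single-domain indel at either terminus gives a distance of 1, while a tandem duplication of several repeat domains gives a distance equal to the number of added copies. A pairwise matrix of such distances could in principle feed a standard distance-based tree-building method, though the abstract does not say which method the authors used.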

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Eukaryotic Cells
  • Evolution, Molecular*
  • Guanine Nucleotide Exchange Factors / chemistry
  • Guanine Nucleotide Exchange Factors / genetics
  • Guanine Nucleotide Exchange Factors / metabolism*
  • Phylogeny
  • Protein Structure, Tertiary
  • Protein-Tyrosine Kinases / chemistry
  • Protein-Tyrosine Kinases / genetics
  • Protein-Tyrosine Kinases / metabolism*
  • Proteome*
  • Sequence Analysis

Substances

  • Guanine Nucleotide Exchange Factors
  • Proteome
  • Protein-Tyrosine Kinases