A comprehensive examination of protein sequences for evidence of internal gene duplication

J Mol Evol. 1978 Feb 21;10(4):265-81. doi: 10.1007/BF01734217.

Abstract

We have implemented a routine procedure for screening protein sequences for evidence of intragenic duplications. We tested 163 protein sequences representing 116 superfamilies of unrelated proteins. Twenty superfamilies contain proteins with internal gene duplications. The intragenic duplications detected can be divided into two major types. (1) One or more duplications of all or part of a gene produce a protein with two or several detectable regions of sequence homology. Sequences from 18 superfamilies contained this type of duplication. (2) Repeated reduplication of a small DNA segment can produce a protein that is repetitive over most of its length. Three superfamilies contain such repetitive sequences. We also investigated the limits of detection of ancient duplications using sequences derived by random mutation of a model sequence consisting of ten 10-residue repeats. The original repetitive nature of the sequence was usually detected after 250 point mutations even though the ancestral segment could not be accurately reconstructed.
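The detection-limit experiment described above can be illustrated with a small simulation. The sketch below is not the screening procedure used in the paper, which the abstract does not specify. It assumes a uniform 20-letter amino acid alphabet, uniformly random point substitutions, and a simple fraction-of-identities score at a fixed offset. All function names (`make_repetitive_sequence`, `mutate`, `offset_identity`) are hypothetical and chosen only for illustration.

```python
import random

# Assumption: the 20 standard amino acids, all equally likely.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_repetitive_sequence(unit_length=10, copies=10, rng=None):
    """Build a model sequence made of `copies` identical repeats of a random unit."""
    rng = rng or random.Random()
    unit = [rng.choice(AMINO_ACIDS) for _ in range(unit_length)]
    return unit * copies


def mutate(sequence, n_mutations, rng=None):
    """Apply n random point substitutions; a site may be hit more than once."""
    rng = rng or random.Random()
    mutated = list(sequence)
    for _ in range(n_mutations):
        pos = rng.randrange(len(mutated))
        current = mutated[pos]
        # Each substitution always changes the residue at the chosen site.
        mutated[pos] = rng.choice([a for a in AMINO_ACIDS if a != current])
    return mutated


def offset_identity(sequence, offset):
    """Fraction of positions i where sequence[i] == sequence[i + offset]."""
    pairs = len(sequence) - offset
    matches = sum(1 for i in range(pairs) if sequence[i] == sequence[i + offset])
    return matches / pairs


if __name__ == "__main__":
    rng = random.Random(0)
    model = make_repetitive_sequence(rng=rng)
    # Compare the repeat-length offset (10) with an unrelated offset (7).
    for n in (0, 50, 100, 250):
        seq = mutate(model, n, rng)
        print(n, round(offset_identity(seq, 10), 3), round(offset_identity(seq, 7), 3))
```

In this toy setting, a residual repeat signal shows up as a higher identity fraction at offset 10 than at an unrelated offset. A realistic screen would use a substitution-aware scoring scheme and a significance test rather than raw identities.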

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Biological Evolution*
  • Computers
  • Genes*
  • Humans
  • Models, Biological
  • Proteins / genetics*

Substances

  • Proteins