Whole-genome duplications in the ancestral vertebrate are detectable in the distribution of gene family sizes of tetrapod species

J Mol Evol. 2008 Oct;67(4):343-57. doi: 10.1007/s00239-008-9145-x. Epub 2008 Sep 25.

Abstract

A clustering of all protein-coding genes from the complete genomes of five tetrapod species into gene families shows a clear deviation from the expected power-law distribution of gene family size. We hypothesize that at least part of the deviation is the result of the two whole-genome duplications (WGDs) that are now known, with reasonable certainty, to have occurred prior to the fish-tetrapod split. We build a model of homologous gene family evolution and perform simulations to show that speciations alone cannot produce a distribution that resembles the empirical data. To replicate the features of the empirical distribution, the simulation must incorporate two WGD events. In these WGDs, a significant number of the gene duplicates generated must have a higher retention rate than they do following small-scale duplication (SSD). This requirement is consistent with what is known about duplicate retention following a WGD, namely, that genes belonging to specific functional classes, such as genes regulating transcription, are much more likely to be retained following WGD than SSD. We conclude that the deviation from the power law that we observe in the empirical data is the result of the two WGDs that occurred in the ancestral chordate. This implies that the two ancient WGDs continue to have a structural effect on gene families approximately 500 million years after the initial events. On the one hand, this is a surprising result, given the limited retention of duplicates generated by a WGD and the continual SSD, which further weakens the signal created by the fraction of duplicate pairs that are retained. On the other hand, the capacity of WGD to change the architecture of gene families in a profound and lasting way is consistent with the observed correlation between WGDs and important evolutionary transitions.
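The kind of simulation described in the abstract can be illustrated with a toy sketch. This is not the authors' model: it is a minimal discrete-time birth-death process on family sizes, assuming each gene duplicates or is lost independently at fixed per-step rates (SSD), and that at chosen steps every gene is copied and each copy is kept with a separate retention probability (WGD). The function names, parameters, and default values are all hypothetical and chosen only for illustration; speciation is not modeled.

```python
import random
from collections import Counter


def simulate_family_sizes(n_families=1000, n_steps=50, dup_rate=0.02,
                          loss_rate=0.01, wgd_steps=(), wgd_retention=0.2,
                          seed=0):
    """Toy gene family simulation (illustrative assumption, not the paper's model).

    Each family starts with one gene. At every step each gene is duplicated
    with probability dup_rate and lost with probability loss_rate (SSD).
    At steps listed in wgd_steps, every gene is copied first and each copy
    is retained with probability wgd_retention (WGD).
    """
    rng = random.Random(seed)
    wgd_steps = set(wgd_steps)
    sizes = [1] * n_families
    for step in range(n_steps):
        is_wgd = step in wgd_steps
        for i, size in enumerate(sizes):
            if size == 0:
                continue  # family has gone extinct
            if is_wgd:
                # Whole-genome duplication: one copy per gene, kept independently.
                size += sum(rng.random() < wgd_retention for _ in range(size))
            # Small-scale duplication and loss, applied gene by gene.
            gains = sum(rng.random() < dup_rate for _ in range(size))
            losses = sum(rng.random() < loss_rate for _ in range(size))
            sizes[i] = size + gains - losses
    return sizes


def size_histogram(sizes):
    """Count how many surviving families have each size."""
    return Counter(s for s in sizes if s > 0)


if __name__ == "__main__":
    no_wgd = size_histogram(simulate_family_sizes())
    two_wgd = size_histogram(simulate_family_sizes(wgd_steps=(10, 15)))
    for k in range(1, 9):
        print(k, no_wgd.get(k, 0), two_wgd.get(k, 0))
```

Comparing the two histograms on a log-log scale is one way to see whether the WGD steps shift weight toward particular family sizes relative to the SSD-only run. Whether the toy reproduces the deviation reported in the abstract depends entirely on the assumed rates.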

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Gene Duplication*
  • Genetic Speciation
  • Genome / genetics*
  • Humans
  • Multigene Family / genetics*
  • Phylogeny*
  • Vertebrates / genetics*