Locating large-scale gene duplication events through reconciled trees: implications for identifying ancient polyploidy events in plants

J Comput Biol. 2009 Aug;16(8):1071-83. doi: 10.1089/cmb.2009.0139.

Abstract

Recent analyses of plant genomic data have found extensive evidence of ancient whole-genome duplication (or polyploidy) events, but there are many unresolved questions regarding the number and timing of such events in plant evolutionary history. We describe the first exact and efficient algorithm for the Episode Clustering problem, which, given a collection of rooted gene trees and a rooted species tree, seeks the minimum number of locations on the species tree at which the gene duplication events can be placed. Solving this problem allows one to place gene duplication events onto nodes of a given species tree and potentially detect large-scale gene duplication events. We examined the performance of an implementation of our algorithm using 85 plant gene trees that contain genes from a total of 136 plant taxa. We found evidence of large-scale gene duplication events in Populus, Gossypium, Poaceae, Asteraceae, Brassicaceae, Solanaceae, Fabaceae, and near the root of the eudicot clade that are consistent with previous genomic evidence. However, a lack of phylogenetic signal within the gene trees can produce erroneous evidence of large-scale duplication events, especially near the root of the species tree. Although the results of our algorithm should be interpreted cautiously, they provide hypotheses for precise locations of large-scale gene duplication events with data from relatively few gene trees and can complement other genomic approaches to provide a more comprehensive view of ancient large-scale gene duplication events.
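
To make the idea of placing duplications on species-tree nodes concrete, the following is a minimal sketch of the standard least-common-ancestor (LCA) reconciliation step that such placement builds on. It is not the Episode Clustering algorithm described in the abstract, which additionally minimizes the number of distinct locations. The nested-tuple tree encoding, the function names, and the toy trees are assumptions made for illustration only; gene-tree leaves are labeled with the species they were sampled from.

```python
from collections import Counter

def leaf_set(tree):
    """Return the frozenset of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def species_clades(species_tree):
    """List the leaf set (clade) of every node in the species tree."""
    clades = [leaf_set(species_tree)]
    if not isinstance(species_tree, str):
        for child in species_tree:
            clades.extend(species_clades(child))
    return clades

def lca_map(species, clades):
    """Smallest species-tree clade that contains all the given species."""
    return min((c for c in clades if species <= c), key=len)

def duplication_nodes(gene_tree, clades):
    """Species-tree clades to which duplication nodes map under LCA mapping.

    A gene-tree node is called a duplication when it maps to the same
    species-tree node as at least one of its children.
    """
    if isinstance(gene_tree, str):
        return []
    found = []
    for child in gene_tree:
        found.extend(duplication_nodes(child, clades))
    here = lca_map(leaf_set(gene_tree), clades)
    child_maps = [lca_map(leaf_set(child), clades) for child in gene_tree]
    if here in child_maps:
        found.append(here)
    return found

# Hypothetical toy data: one species tree and two gene trees.
species_tree = (("A", "B"), "C")
gene_trees = [
    ((("A", "B"), ("A", "B")), "C"),  # duplication before the A/B split
    (("A", "A"), ("B", "C")),         # one deep and one A-specific duplication
]

clades = species_clades(species_tree)
counts = Counter()
for gene_tree in gene_trees:
    counts.update(duplication_nodes(gene_tree, clades))

for clade, n in counts.items():
    print(sorted(clade), n)
```

Tallying inferred duplications per species-tree node in this way gives only the lowest possible placement for each duplication. A node that collects duplications from many gene trees is a candidate location for a large-scale event, subject to the caution about weak phylogenetic signal noted in the abstract.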

MeSH terms

  • Algorithms*
  • Evolution, Molecular*
  • Gene Duplication*
  • Genome, Plant
  • Genomics / methods*
  • Plants / genetics*
  • Polyploidy*