Conservation patterns in different functional sequence categories of divergent Drosophila species

Genomics. 2006 Oct;88(4):431-42. doi: 10.1016/j.ygeno.2006.03.012. Epub 2006 May 11.

Abstract

We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of six recently sequenced Drosophila species: D. melanogaster, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis, and D. mojavensis. Based on these distributions, we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs, and nonfunctional sequences. At the same time, the blocks in the coding regions carried a 3N + 2 length signature characteristic of synonymous substitutions in the third codon position. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We have also shown that the longest ungapped blocks, or "ultraconserved" sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discuss how restraining conservation patterns may help in mapping functional sequence categories and improve genome annotation.
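
To make the notion of a fully conserved ungapped block concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a pairwise alignment given as two equal-length strings with "-" marking gaps. The function names (conserved_block_lengths, length_mod3_counts) and the toy sequences are hypothetical, introduced only for illustration. It collects the lengths of maximal runs of identical, ungapped columns and tallies them modulo 3. If mismatches fall only on third codon positions, a block bounded by two mismatches spans a whole number of codons plus the first two positions of the next one, which gives lengths of the form 3N + 2.

```python
from collections import Counter
from typing import List


def conserved_block_lengths(seq_a: str, seq_b: str) -> List[int]:
    """Return lengths of maximal runs of identical, ungapped alignment columns.

    Assumes seq_a and seq_b are rows of one pairwise alignment
    (equal length, '-' for gaps). Hypothetical helper for illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    lengths = []
    run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b and a != "-":
            run += 1          # column is identical and ungapped: extend block
        else:
            if run:
                lengths.append(run)  # a mismatch or gap closes the block
            run = 0
    if run:
        lengths.append(run)   # block that reaches the end of the alignment
    return lengths


def length_mod3_counts(lengths: List[int]) -> Counter:
    """Tally block lengths by remainder modulo 3 (2 corresponds to 3N + 2)."""
    return Counter(n % 3 for n in lengths)


# Toy coding alignment (made-up sequences): only third codon positions differ.
# Codons: ATG GCT AAA GGT CTG TAA  vs  ATG GCC AAA GGA CTG TAA
a = "ATGGCTAAAGGTCTGTAA"
b = "ATGGCCAAAGGACTGTAA"

lengths = conserved_block_lengths(a, b)
print(lengths)                                      # [5, 5, 6]
print(sorted(length_mod3_counts(lengths).items()))  # [(0, 1), (2, 2)]
```

In this toy case the block enclosed by the two third-position mismatches has length 5 = 3·1 + 2. The final block has length 6 because it is bounded by the end of the alignment, not by a mismatch, so blocks at alignment edges would need separate handling in any real analysis.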

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Base Sequence
  • Conserved Sequence*
  • Drosophila / genetics*
  • Enhancer Elements, Genetic
  • Exons
  • Genome, Insect
  • Introns
  • MicroRNAs
  • Promoter Regions, Genetic
  • Regulatory Sequences, Nucleic Acid
  • Untranslated Regions

Substances

  • MicroRNAs
  • Untranslated Regions