Analysis of genomic G + C content, codon usage, initiator codon context and translation termination sites in Tetrahymena thermophila

J Eukaryot Microbiol. 1999 May-Jun;46(3):239-47. doi: 10.1111/j.1550-7408.1999.tb05120.x.

Abstract

In recent years, the amount of molecular sequencing data from Tetrahymena thermophila has increased dramatically. We analyzed G + C content, codon usage, initiator codon context and stop codon sites in the extremely A + T-rich genome of this ciliate. Average G + C content was 38% for protein-coding regions, 21% for 5' non-coding sequences, 19% for 3' non-coding sequences, 15% for introns, 19% for micronuclear-limited sequences and 17% for macronuclear-retained sequences flanking micronuclear-specific regions. The 75 available T. thermophila protein-coding sequences favored codons ending in T and, where possible, avoided those with G in the third position. Highly expressed genes were relatively G + C-rich and exhibited an extremely biased pattern of codon usage, whereas developmentally regulated genes were more A + T-rich and showed less codon usage bias. Regions immediately preceding Tetrahymena translation initiator codons were generally A-rich. For the 60 stop codons examined, the frequency of G at the end + 1 site was much higher than expected, whereas C never occupied this position.
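
The quantities summarized in this abstract (G + C fraction, third-position base preference and the base at the end + 1 site) are simple to tabulate from sequence data. The Python sketch below is illustrative only and is not the authors' method. The function names and the toy sequence are hypothetical, and the end + 1 site is assumed here to mean the first base immediately 3' of the stop codon.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def split_codons(cds: str) -> list:
    """Split an in-frame coding sequence into codons.

    Any trailing partial codon (1 or 2 bases) is dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def third_position_counts(cds: str) -> Counter:
    """Count which base occupies the third position of each codon."""
    return Counter(codon[2] for codon in split_codons(cds))


def end_plus_one_base(genomic: str, cds_end: int) -> str:
    """Base immediately 3' of the stop codon (assumed meaning of 'end + 1').

    cds_end is the index one past the last base of the stop codon.
    Returns an empty string if no flanking base is available.
    """
    return genomic[cds_end:cds_end + 1].upper()


# Hypothetical toy data, not taken from the paper.
toy_cds = "ATGGCTAAATTTGGTTGA"   # ATG GCT AAA TTT GGT TGA
toy_flank = "GAT"                # made-up 3' flanking sequence

print(round(gc_content(toy_cds), 3))                         # 0.333
print(third_position_counts(toy_cds).most_common())          # [('T', 3), ('A', 2), ('G', 1)]
print(end_plus_one_base(toy_cds + toy_flank, len(toy_cds)))  # G
```

Applying these counts separately to coding regions, non-coding flanks and introns would give per-category averages of the kind reported above. A real analysis would also need curated annotations and the ciliate nuclear genetic code, which this sketch does not handle.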

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Base Composition / genetics*
  • Codon / genetics*
  • Codon, Initiator / genetics
  • Codon, Terminator / genetics
  • Genes, Protozoan
  • Protein Biosynthesis*
  • Tetrahymena thermophila / genetics*

Substances

  • Codon
  • Codon, Initiator
  • Codon, Terminator