Evidence from nuclear sequences that invariable sites should be considered when sequence divergence is calculated

Mol Biol Evol. 1989 May;6(3):270-89. doi: 10.1093/oxfordjournals.molbev.a040550.

Abstract

It has long been known, from the distribution of multiple amino acid replacements, that not all amino acids of a sequence are replaceable. More recently, the phenomenon was observed at the nucleotide level in mitochondrial DNA even after allowing for different rates of transition and transversion substitutions. We have extended the search to globin gene sequences from various organisms, with the following results: (1) Nearly every data set showed evidence of invariable nucleotide positions. (2) In all data sets, substitution rates of transversions and transitions were never in the ratio of 2/1, and rarely was the ratio even constant. (3) Only rarely (e.g., the third codon position of beta hemoglobins) was it possible to fit the data set solely by making allowance for the number of invariable positions and for the relative rates of transversion and transition substitutions. (4) For one data set (the second codon position of beta hemoglobins) we were able to simulate the observed data by making the allowance in (3) and having the set of covariotides (concomitantly variable nucleotides) be small in number and be turned over in a stochastic manner with a probability that was appreciable. (5) The fit in the latter case suggests, if the assumptions are correct and at all common, that current procedures for estimating the total number of nucleotide substitutions in two genes since their divergence from their common ancestor could be low by as much as an order of magnitude. (6) The fact that only a small fraction of the nucleotide positions differ is no guarantee that one is not seriously underestimating the total amount of divergence (substitutions). (7) Most data sets are so heterogeneous in their number of transition and transversion differences that none of the current models of nucleotide substitution seem to fit them even after (a) segregation of coding from noncoding sequences and (b) splitting of the codon into three subsets by codon position. (8) These frequently occurring problems cannot be seen unless several reasonably divergent orthologous genes are examined together.
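
Points (5) and (6) can be illustrated with a minimal sketch. It is not the authors' procedure: it swaps in the simple Jukes-Cantor one-parameter correction, which ignores the transition/transversion distinction and the covariotide turnover described above, and it assumes the fraction of invariable positions is known. The function names and the parameter `f_inv` are hypothetical, chosen only for illustration. The sketch shows that the same observed proportion of differing positions implies more substitutions once the differences are attributed only to the positions that are free to vary.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor estimate of substitutions per site.

    p is the observed proportion of positions that differ between
    two aligned sequences. All positions are assumed equally variable.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_distance_with_invariable(p, f_inv):
    """Same correction, but with a fraction f_inv of positions assumed
    unable to change (an assumption of this sketch, taken as known).

    The observed differences are concentrated in the variable positions,
    so the correction is applied there and then averaged over all positions.
    """
    if not 0.0 <= f_inv < 1.0:
        raise ValueError("f_inv must satisfy 0 <= f_inv < 1")
    p_variable = p / (1.0 - f_inv)  # proportion differing among variable sites
    return (1.0 - f_inv) * jc69_distance(p_variable)


if __name__ == "__main__":
    p = 0.10  # hypothetical: 10% of positions differ
    for f_inv in (0.0, 0.5, 0.8):
        d = jc69_distance_with_invariable(p, f_inv)
        print(f"f_inv={f_inv:.1f}  substitutions per site = {d:.4f}")
```

With these hypothetical inputs the estimate grows as `f_inv` grows, even though the observed proportion of differences stays fixed. The increase in this simplified model is modest; the order-of-magnitude underestimate in (5) arises under the authors' covariotide assumptions, which this sketch does not attempt to reproduce.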

MeSH terms

  • Base Sequence
  • DNA / genetics
  • Female
  • Genes
  • Genetic Variation*
  • Globins / genetics*
  • Male
  • Sequence Homology, Nucleic Acid

Substances

  • Globins
  • DNA