Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes

Mol Biol Evol. 2002 Jan;19(1):49-57. doi: 10.1093/oxfordjournals.molbev.a003981.

Abstract

The nonsynonymous to synonymous substitution rate ratio (omega = d(N)/d(S)) provides a sensitive measure of selective pressure at the protein level, with omega values <1, =1, and >1 indicating purifying selection, neutral evolution, and diversifying selection, respectively. Maximum likelihood models of codon substitution developed recently account for variable selective pressures among amino acid sites by employing a statistical distribution for the omega ratio among sites. Those models, called random-sites models, are suitable when we do not know a priori which sites are under what kind of selective pressure. Sometimes prior information (such as the tertiary structure of the protein) might be available to partition sites in the protein into different classes, which are expected to be under different selective pressures. It is then sensible to use such information in the model. In this paper, we implement maximum likelihood models for prepartitioned data sets, which account for the heterogeneity among site partitions by using different omega parameters for the partitions. The models, referred to as fixed-sites models, are also useful for combined analysis of multiple genes from the same set of species. We apply the models to data sets of the major histocompatibility complex (MHC) class I alleles from human populations and of the abalone sperm lysin genes. Structural information is used to partition sites in MHC into two classes: those in the antigen recognition site (ARS) and those outside it. Positive selection is detected in the ARS by the fixed-sites models. Similarly, sites in lysin are classified into the buried and solvent-exposed classes according to the tertiary structure, and positive selection is detected at the solvent-exposed sites. The random-sites models identify a number of sites under positive selection in each data set, confirming and elaborating the results of the fixed-sites models. The analysis demonstrates the utility of the fixed-sites models, as well as the power of previous random-sites models, which do not use the prior information to partition sites.
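
The interpretation of omega stated at the start of the abstract can be illustrated with a minimal Python sketch that labels one omega value per site partition, as a fixed-sites model would estimate. The function names, the tolerance, and the numeric values are hypothetical placeholders chosen for illustration; they are not estimates from the paper and this is not the authors' implementation. A point estimate above 1 only suggests positive selection; whether it is significant would still need a statistical test.

```python
from typing import Dict


def selection_regime(omega: float, tol: float = 1e-9) -> str:
    """Map an omega (dN/dS) value to the regime named in the abstract.

    `tol` is an assumed tolerance for treating omega as equal to 1.
    """
    if omega < 0:
        raise ValueError("omega is a rate ratio and cannot be negative")
    if abs(omega - 1.0) <= tol:
        return "neutral evolution"
    if omega < 1.0:
        return "purifying selection"
    return "diversifying selection"


def classify_partitions(omegas: Dict[str, float]) -> Dict[str, str]:
    """Label each predefined site partition by its own omega value."""
    return {name: selection_regime(w) for name, w in omegas.items()}


if __name__ == "__main__":
    # Hypothetical per-partition omegas, NOT results from the paper.
    example = {"ARS": 2.0, "non-ARS": 0.3}
    for partition, regime in classify_partitions(example).items():
        print(f"{partition}: {regime}")
```

Keeping one omega per partition mirrors the fixed-sites idea of assigning sites to classes in advance; a random-sites analysis would instead infer the class of each site from the data.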

MeSH terms

  • Alleles
  • Amino Acid Substitution / genetics*
  • Animals
  • Codon / genetics*
  • Evolution, Molecular*
  • Histocompatibility Antigens Class I / chemistry
  • Histocompatibility Antigens Class I / genetics
  • Humans
  • Hydrophobic and Hydrophilic Interactions
  • Likelihood Functions
  • Male
  • Models, Genetic*
  • Models, Molecular
  • Mollusca / genetics
  • Mucoproteins / genetics
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / classification
  • Proteins / genetics*
  • Selection, Genetic
  • Solvents
  • Spermatozoa / metabolism

Substances

  • Codon
  • Histocompatibility Antigens Class I
  • Mucoproteins
  • Proteins
  • Solvents
  • lysin, gastropoda