Bayesian variable selection for detecting adaptive genomic differences among populations

Genetics. 2008 Mar;178(3):1817-29. doi: 10.1534/genetics.107.081281. Epub 2008 Feb 1.

Abstract

We extend an F_ST-based Bayesian hierarchical model, implemented via Markov chain Monte Carlo, for the detection of loci that might be subject to positive selection. This model divides the factors influencing F_ST into locus-specific effects, population-specific effects, and effects specific to each locus-population combination. We introduce a Bayesian auxiliary variable for each locus effect to automatically select nonneutral locus effects. As a by-product, the efficiency of the original approach is improved by using a reparameterization of the model. The statistical power of the extended algorithm is assessed with simulated data sets from a Wright-Fisher model with migration. We find that the inclusion of model selection yields a clear improvement in discrimination as measured by the area under the receiver operating characteristic (ROC) curve. Additionally, we illustrate and discuss the quality of the newly developed method on the basis of an allozyme data set of the fruit fly Drosophila melanogaster and a sequence data set of the wild tomato Solanum chilense. For data sets with small sample sizes, high mutation rates, and/or long sequences, however, methods based on nucleotide statistics should be preferred.
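
The two ideas in the abstract that lend themselves to a small illustration are the decomposition of F_ST into additive effects with a per-locus on/off indicator, and the use of ROC area to score how well selected loci are separated from neutral ones. The sketch below is not the authors' implementation. It assumes a logit link between F_ST and the summed effects (the abstract does not state the link function), and the names `fst_from_effects`, `roc_auc`, `alpha`, `beta`, `gamma`, and `delta` are hypothetical, chosen only for this example.

```python
import math


def fst_from_effects(alpha, beta, gamma=0.0, delta=1):
    """F_ST for one locus-population pair under an ASSUMED logit decomposition.

    alpha: locus-specific effect
    beta:  population-specific effect
    gamma: locus-by-population effect
    delta: 0/1 auxiliary indicator; 0 switches the locus effect off (neutral locus)
    """
    eta = delta * alpha + beta + gamma
    return 1.0 / (1.0 + math.exp(-eta))


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    scores: per-locus evidence for selection, e.g. the posterior frequency
            with which the indicator equals 1 (an assumption of this sketch)
    labels: 1 for truly selected loci, 0 for neutral loci
    Ties between a selected and a neutral locus count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one selected and one neutral locus")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

With the indicator set to 0, a locus contributes no effect of its own and its F_ST is governed only by the population and interaction terms. In a simulation study where the truly selected loci are known, passing the per-locus scores and the true labels to `roc_auc` gives a single number between 0 and 1 for comparing methods, with 0.5 corresponding to random ranking.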

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bayes Theorem
  • Computer Simulation
  • Databases, Genetic
  • Drosophila melanogaster / genetics
  • Genome / genetics*
  • Models, Genetic
  • Population Dynamics
  • ROC Curve
  • Selection, Genetic*
  • Solanum lycopersicum / genetics