Accurate local-ancestry inference in exome-sequenced admixed individuals via off-target sequence reads

Am J Hum Genet. 2013 Nov 7;93(5):891-9. doi: 10.1016/j.ajhg.2013.10.008.

Abstract

Estimates of the ancestry of specific chromosomal regions in admixed individuals are useful for studies of human evolutionary history and for genetic association studies. Previously, this ancestry inference relied on high-quality genotypes from genome-wide association study (GWAS) arrays. These high-quality genotypes are not always available when samples are exome sequenced, and exome sequencing is the strategy of choice for many ongoing genetic studies. Here we show that off-target reads generated during exome-sequencing experiments can be combined with on-target reads to accurately estimate the ancestry of each chromosomal segment in an admixed individual. To reconstruct local ancestry, our method SEQMIX models aligned bases directly instead of relying on hard genotype calls. We evaluate the accuracy of our method through simulations and analysis of samples sequenced by the 1000 Genomes Project and the NHLBI Grand Opportunity Exome Sequencing Project. In African Americans, we show that local-ancestry estimates derived by our method are very similar to those derived with Illumina's Omni 2.5M genotyping array and much improved in relation to estimates that use only exome genotypes and ignore off-target sequencing reads. Software implementing this method, SEQMIX, can be applied to analysis of human population history or used for genetic association studies in admixed individuals.
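To make concrete what modeling aligned bases directly, rather than hard genotype calls, can look like, here is a minimal sketch. It is not SEQMIX's actual model, which the abstract does not specify. It assumes the standard genotype-likelihood formulation, in which each aligned base contributes a probability derived from its Phred quality, and it assumes that read errors are independent and that the two alleles at a site are drawn independently from ancestry-specific allele frequencies. The function names genotype_likelihoods and ancestry_emission, and all parameters, are hypothetical.

```python
def genotype_likelihoods(bases, quals, ref, alt):
    """Return [P(reads | g)] for g = 0, 1, 2 copies of the alt allele.

    bases: aligned base calls covering one site, e.g. "AAGA"
    quals: Phred-scaled base qualities, one per base
    Assumes independent read errors (an illustrative simplification).
    """
    liks = [1.0, 1.0, 1.0]
    for base, q in zip(bases, quals):
        err = 10 ** (-q / 10)  # Phred score -> error probability
        # Probability of seeing this base if the read came from ref / alt.
        p_from_ref = (1 - err) if base == ref else err / 3
        p_from_alt = (1 - err) if base == alt else err / 3
        for g in range(3):
            # A read samples one of the two chromosomes with equal chance.
            liks[g] *= (1 - g / 2) * p_from_ref + (g / 2) * p_from_alt
    return liks


def ancestry_emission(liks, freq_a, freq_b):
    """P(reads | one allele from a population with alt frequency freq_a,
    the other from a population with alt frequency freq_b).

    Sums over the unknown genotype instead of committing to a hard call.
    """
    prior = [
        (1 - freq_a) * (1 - freq_b),                    # g = 0
        freq_a * (1 - freq_b) + (1 - freq_a) * freq_b,  # g = 1
        freq_a * freq_b,                                # g = 2
    ]
    return sum(l * p for l, p in zip(liks, prior))
```

With this formulation, a site covered by only one or two off-target reads still contributes weak evidence about ancestry, and a site with no reads contributes none (the emission equals 1 for every ancestry state). A hard genotype call at such low coverage would be either unavailable or unreliable. In a Markov-chain approach, these per-site values would serve as emission probabilities for each local-ancestry state.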

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Black or African American / genetics
  • Chromosome Mapping
  • Computer Simulation
  • Empirical Research
  • Exome*
  • Genetic Association Studies / methods*
  • Genetics, Population / methods*
  • Genome, Human
  • Genotype
  • Humans
  • Linkage Disequilibrium
  • Markov Chains
  • Models, Genetic
  • Oligonucleotide Array Sequence Analysis
  • Sequence Analysis, DNA / methods*
  • Software