A deep catalog of autosomal single nucleotide variation in the pig

PLoS One. 2015 Mar 19;10(3):e0118867. doi: 10.1371/journal.pone.0118867. eCollection 2015.

Abstract

A comprehensive catalog of variability in a given species is useful for many important purposes, e.g., designing high-density arrays or pinpointing potential mutations of economic or physiological interest. Here we provide a genome-wide, worldwide catalog of single nucleotide variants by simultaneously analyzing the shotgun sequence of 128 pigs and five suid outgroups. Despite the high SNP missing rate of some individuals (up to 88%), we retrieved over 48 million high-quality variants. Of these, we were able to assess the ancestral allele of more than 39M biallelic SNPs. We found SNPs in 21,455 out of the 25,322 annotated genes in pig assembly 10.2. The annotation showed that more than 40% of the variants were novel, i.e., not present in dbSNP. Surprisingly, we found a large variability in transition/transversion rate along the genome, which is very well explained (R² = 0.79), primarily by genomic differences in CpG content and recombination rate. The number of SNPs per window also varied but was less dependent on known factors such as gene density, missing rate or recombination (R² = 0.48). When we divided the samples into four groups, Asian wild boar (ASWB), Asian domestics (ASDM), European wild boar (EUWB) and European domestics (EUDM), we found a marked correlation in allele frequencies between domestics and wild boars within Asia and within Europe, but not across continents, due to the large evolutionary distance between pigs of the two continents (~1.2 MYA). In general, the porcine species showed a small percentage of SNPs exclusive to each population group. EUWB and EUDM were predicted to harbor a larger fraction of potentially deleterious mutations, according to the SIFT algorithm, than Asian samples, perhaps a result of background selection being less effective due to a lower effective population size in Europe.
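
The per-window transition/transversion rate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the tuple input format, and the 1 Mb window size are all assumptions made for illustration. It relies only on the standard definition that transitions are A<->G and C<->T substitutions, and that every other single-base substitution is a transversion.

```python
from collections import defaultdict

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def is_transition(ref, alt):
    """Return True if the biallelic SNP ref>alt is a transition."""
    return (ref.upper(), alt.upper()) in TRANSITIONS


def ts_tv_by_window(snps, window_size=1_000_000):
    """Compute the transition/transversion ratio per genomic window.

    snps: iterable of (chrom, pos, ref, alt) tuples (hypothetical input format).
    window_size: window length in bp; 1 Mb is an assumed value, not taken
    from the abstract.
    Returns {(chrom, window_index): ratio}, with None where a window
    contains no transversions.
    """
    counts = defaultdict(lambda: [0, 0])  # [transitions, transversions]
    for chrom, pos, ref, alt in snps:
        key = (chrom, pos // window_size)
        if is_transition(ref, alt):
            counts[key][0] += 1
        else:
            counts[key][1] += 1
    return {
        key: (ts / tv if tv else None)
        for key, (ts, tv) in counts.items()
    }
```

The resulting per-window ratios could then be regressed on covariates such as CpG content and recombination rate, which is the kind of relationship the reported R² values summarize.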

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Europe
  • Gene Frequency
  • Gene Library*
  • Genetic Variation / genetics*
  • Genetics, Population
  • Genome / genetics*
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide / genetics*
  • Population Density
  • Species Specificity
  • Sus scrofa / genetics*

Grants and funding

This work was funded by AGL2010-14822 and AGL2013-41834-R (Ministry of Economy and Science, Spain) to MPE and SERO. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.