Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds

PLoS One. 2017 Mar 21;12(3):e0173954. doi: 10.1371/journal.pone.0173954. eCollection 2017.

Abstract

Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). Approximately 4.3 billion reads in total from an Illumina HiSeq 2000 sequencer yielded 10.7- to 16.4-fold genome coverage per animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels that could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.
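The concordance rate reported above compares array genotypes with sequencing genotypes at the same sites. The abstract does not describe how the comparison was computed, so the following is only a minimal sketch of one common definition: the share of sites with identical calls among sites called by both platforms. The function name `concordance_rate`, the 0/1/2 alternate-allele-count encoding, the use of `None` for missing calls, and the toy genotypes are all illustrative assumptions, not the authors' pipeline or data.

```python
from typing import Optional, Sequence


def concordance_rate(
    array_calls: Sequence[Optional[int]],
    seq_calls: Sequence[Optional[int]],
) -> float:
    """Return the fraction of sites where both platforms give the same genotype.

    Genotypes are assumed to be encoded as alternate-allele counts (0, 1, 2).
    Sites where either platform has a missing call (None) are skipped.
    """
    if len(array_calls) != len(seq_calls):
        raise ValueError("both call lists must cover the same sites")

    compared = 0
    matches = 0
    for array_gt, seq_gt in zip(array_calls, seq_calls):
        if array_gt is None or seq_gt is None:
            continue  # no comparison is possible at this site
        compared += 1
        if array_gt == seq_gt:
            matches += 1

    if compared == 0:
        raise ValueError("no sites were called by both platforms")
    return matches / compared


# Hypothetical toy genotypes for one animal (not data from the study).
array_calls = [0, 1, 2, 1, None, 0, 2, 1]
seq_calls = [0, 1, 2, 0, 1, 0, 2, 1]

rate = concordance_rate(array_calls, seq_calls)
print(f"Concordance: {rate:.2%}")
```

Excluding sites with a missing call on either platform is a design choice in this sketch; counting them as discordant would lower the rate, so the rule used should be stated alongside any reported figure.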

MeSH terms

  • Animals
  • Brazil
  • Breeding
  • Cattle / classification
  • Cattle / genetics*
  • Female
  • Genotype
  • High-Throughput Nucleotide Sequencing / veterinary
  • INDEL Mutation*
  • Male
  • Molecular Sequence Annotation
  • Oligonucleotide Array Sequence Analysis / veterinary
  • Polymorphism, Single Nucleotide*
  • Sequence Analysis, DNA / veterinary
  • Species Specificity

Grants and funding

MVGBS was supported by the following projects: Embrapa (Brazil) SEG 02.09.07.008.00.00 “Genomic Selection in Dairy Cattle in Brazil”, Embrapa (Brazil) SEG 01.11.07.002.00.00 “National network for development of innovative genomic strategies applied to animal breeding, conservation and production” (PI – ARC), CNPq PVE 407246/2013-4 “Genomic Selection in Dairy Gyr and Girolando Breeds”, and FAPEMIG CVZ PPM 00395/14 “Genomic Selection in Brazilian Dairy Breeds”. NBS was supported by a postdoctoral fellowship from the Coordination for the Improvement of Higher Education Personnel (CAPES/PNPD). TCSC received a fellowship from the São Paulo Research Foundation (FAPESP – 15/08939-0). ARC, MAM, DPM, MFM, MRC, and MVGBS were supported by productivity research fellowships from the National Council for Scientific and Technological Development (CNPq). We would like to thank the EMBRAPA Multiuser Bioinformatics Laboratory (Laboratório Multiusuário de Bioinformática da Embrapa) for providing high-performance computational infrastructure.