Genomic predictions in Angus cattle: comparisons of sample size, response variables, and clustering methods for cross-validation

J Anim Sci. 2014 Feb;92(2):485-97. doi: 10.2527/jas.2013-6757. Epub 2014 Jan 15.

Abstract

Advances in genomics, molecular biology, and statistical genetics have created a paradigm shift in the way livestock producers pursue genetic improvement in their herds. Together, these technologies make it possible to combine genotypic and phenotypic information to compute genomically enhanced measures of the genetic merit of individual animals. However, large numbers of genotyped and phenotyped animals are required to produce robust estimates of the effects of SNP that are summed together to generate direct genomic breeding values (DGV). Data on 11,756 Angus animals genotyped with the Illumina BovineSNP50 Beadchip were used to develop genomic predictions for 17 traits reported by the American Angus Association through Angus Genetics Inc. in their National Cattle Evaluation program. Marker effects were computed using a 5-fold cross-validation approach and a Bayesian model averaging algorithm. The accuracies were examined with EBV and deregressed EBV (DEBV) response variables and with K-means and identical by state (IBS)-based cross-validation methodologies. The cross-validation accuracies obtained using EBV response variables were consistently greater than those obtained using DEBV (average correlations were 0.64 vs. 0.57). The accuracies obtained using K-means cross-validation were consistently smaller than accuracies obtained with the IBS-based cross-validation approach (average correlations were 0.58 vs. 0.64 with EBV used as a response variable). Comparing the results of the current study with those of a similar study based on only 2,253 records indicated that the larger training population resulted in higher accuracies in validation animals and explained, on average, an additional 18% of genetic variance across all traits (a 69% improvement).
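The workflow described in the abstract (summing SNP effects into a DGV, then measuring accuracy as a correlation within 5-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline. The function names are hypothetical, the marker-effect estimator is a simple per-SNP regression standing in for the Bayesian model averaging algorithm, and folds are assigned at random, whereas the study grouped animals by K-means or IBS-based clustering.

```python
import random
from statistics import mean


def dgv(genotype, effects):
    """Direct genomic value: allele counts (0/1/2) times SNP effects, summed."""
    return sum(g * a for g, a in zip(genotype, effects))


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_marginal(genotypes, response):
    """Placeholder estimator (assumption): one simple regression slope per SNP.

    The study used a Bayesian model averaging algorithm instead.
    """
    y_bar = mean(response)
    effects = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        x_bar = mean(col)
        sxx = sum((x - x_bar) ** 2 for x in col)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(col, response))
        effects.append(sxy / sxx if sxx > 0 else 0.0)
    return effects


def assign_folds(n_animals, k=5, seed=0):
    """Balanced random fold labels (assumption; the study clustered animals)."""
    labels = [i % k for i in range(n_animals)]
    random.Random(seed).shuffle(labels)
    return labels


def cross_validate(genotypes, response, folds, fit=fit_marginal):
    """Train on k-1 folds, predict the held-out fold, average the correlations."""
    accuracies = []
    for f in sorted(set(folds)):
        train = [i for i, lab in enumerate(folds) if lab != f]
        test = [i for i, lab in enumerate(folds) if lab == f]
        effects = fit([genotypes[i] for i in train],
                      [response[i] for i in train])
        predicted = [dgv(genotypes[i], effects) for i in test]
        observed = [response[i] for i in test]
        accuracies.append(pearson(predicted, observed))
    return mean(accuracies)


# Synthetic example: 200 animals, 20 independent SNP, noisy response variable.
rng = random.Random(1)
n, m = 200, 20
true_effects = [rng.gauss(0, 1) for _ in range(m)]
genotypes = [[rng.choice([0, 1, 2]) for _ in range(m)] for _ in range(n)]
response = [dgv(g, true_effects) + rng.gauss(0, 1) for g in genotypes]
folds = assign_folds(n, k=5, seed=1)
accuracy = cross_validate(genotypes, response, folds)
```

Swapping `assign_folds` for a clustering-based assignment would change only which animals share a fold. This is the comparison the abstract reports between the K-means and IBS-based approaches.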

MeSH terms

  • Animals
  • Breeding
  • Cattle / genetics*
  • Cattle / physiology*
  • Cluster Analysis
  • Female
  • Genomics*
  • Male
  • Models, Biological
  • Reproducibility of Results
  • Sample Size
  • United States