Location: Genetics and Animal Breeding
Title: Using pooled data for genomic prediction in a bivariate framework with missing data
Author
BALLER, JOHNNA - University Of Nebraska
KACHMAN, STEPHEN - University Of Nebraska
Kuehn, Larry
SPANGLER, MATTHEW - University Of Nebraska
Submitted to: Journal of Animal Breeding and Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/21/2022
Publication Date: 6/14/2022
Citation: Baller, J.L., Kachman, S.D., Kuehn, L.A., Spangler, M.L. 2022. Using pooled data for genomic prediction in a bivariate framework with missing data. Journal of Animal Breeding and Genetics. Article 12727. https://doi.org/10.1111/jbg.12727.
DOI: https://doi.org/10.1111/jbg.12727
Interpretive Summary: Virtually all beef cattle genetic evaluation programs now utilize genotyping arrays to derive genomically enhanced genetic predictions. Adding genotypic information improves prediction accuracy and thus increases the rate of genetic progress. In previous work, we have shown that further accuracy gains can be obtained by utilizing commercial data that are not routinely recorded in seedstock populations; genotyping pooled DNA from commercial groups can achieve these gains at a significantly lower cost. This study evaluated whether DNA pooling over two different, correlated traits can further increase accuracy gains in genetic evaluations. Results suggested that DNA pooling can be effectively utilized in multi-trait genetic prediction models and that the correlation structure between the traits can also improve accuracy when phenotypes of one trait are missing. This multi-trait DNA pooling framework will increase the utility of commercial data in current genetic evaluations.
Technical Abstract: Pooling samples to derive group genotypes can enable the economically efficient use of commercial animals within genetic evaluations. To test a multivariate framework for genetic evaluations using pooled data, simulation was used to mimic a beef cattle population including two moderately heritable traits with varying genetic correlations, genotypes and pedigree data. There were 15 generations (n = 32,000; random selection and mating), and the last generation was subjected to genotyping through pooling.
Missing records were induced in two ways: (a) sequential culling and (b) random missing records. Gaps in genotyping were also explored whereby genotyping occurred through generation 13 or 14. Pools of 1, 20, 50 and 100 animals were constructed randomly or by minimizing phenotypic variation. Estimated breeding values (EBV) were obtained using a bivariate single-step genomic best linear unbiased prediction model. Pools of 20 animals constructed by minimizing phenotypic variation generally led to accuracies that were not different from those obtained using individual progeny data. Gaps in genotyping led to significantly different EBV accuracies (p < .05) for sires and dams born in the generation nearest the pools. Pooling of any size generally led to larger accuracies than no information from generation 15 regardless of the way missing records arose, the percentage of records available or the genetic correlation. Pooling to aid in the use of commercial data in genetic evaluations can be utilized in multivariate cases with varying relationships between the traits and in the presence of systematic and randomly missing phenotypes.
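The two pool construction strategies named in the abstract (random assignment versus minimizing phenotypic variation) can be illustrated with a small sketch. This is not the authors' simulation or their single-step model. It assumes a single trait, toy data, sorting by phenotype as a simple heuristic for reducing within-pool variation, and a pool-level genotype taken as the mean allele dosage of its members. All function names and parameter values are hypothetical.

```python
import random
from statistics import mean, pvariance


def random_pools(n_animals, pool_size, seed=0):
    """Assign animal indices to pools of pool_size at random."""
    idx = list(range(n_animals))
    random.Random(seed).shuffle(idx)
    return [idx[i:i + pool_size] for i in range(0, n_animals, pool_size)]


def sorted_pools(phenotypes, pool_size):
    """Sort animals by phenotype, then chunk, so each pool holds similar animals.

    Assumption: a simple heuristic for reducing within-pool phenotypic variation.
    """
    idx = sorted(range(len(phenotypes)), key=lambda i: phenotypes[i])
    return [idx[i:i + pool_size] for i in range(0, len(idx), pool_size)]


def pool_summary(pool, phenotypes, genotypes):
    """Return the pool's mean phenotype and mean allele dosage per marker.

    Assumption: the pooled genotype is the average 0/1/2 dosage of the members.
    """
    pooled_pheno = mean(phenotypes[i] for i in pool)
    n_markers = len(genotypes[0])
    pooled_geno = [mean(genotypes[i][m] for i in pool) for m in range(n_markers)]
    return pooled_pheno, pooled_geno


def mean_within_pool_variance(pools, phenotypes):
    """Average, over pools, of the phenotypic variance inside each pool."""
    return mean(pvariance([phenotypes[i] for i in pool]) for pool in pools)


# Toy data (hypothetical sizes, far smaller than the simulated population).
rng = random.Random(1)
n_animals, n_markers, pool_size = 200, 5, 20
phenotypes = [rng.gauss(0.0, 1.0) for _ in range(n_animals)]
genotypes = [[rng.choice((0, 1, 2)) for _ in range(n_markers)]
             for _ in range(n_animals)]

pools_random = random_pools(n_animals, pool_size)
pools_sorted = sorted_pools(phenotypes, pool_size)

var_random = mean_within_pool_variance(pools_random, phenotypes)
var_sorted = mean_within_pool_variance(pools_sorted, phenotypes)
pheno0, geno0 = pool_summary(pools_sorted[0], phenotypes, genotypes)

print(f"within-pool variance, random pools: {var_random:.3f}")
print(f"within-pool variance, sorted pools: {var_sorted:.3f}")
```

In this toy setting, pools built by sorting have much lower within-pool phenotypic variance than random pools, so each pool's mean phenotype is closer to the phenotypes of its members.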