Research Project: Increasing Accuracy of Genomic Prediction, Developing Algorithms, Selecting Markers, and Evaluating New Traits to Improve Dairy Cattle

Location: Animal Genomics and Improvement Laboratory

Title: Big data genomic analysis in dairy cattle

Author
LOURENCO, DANIELA - University Of Georgia
CESARANI, ALBERTO - University Of Sassari
TSURUTA, SHOGO - University Of Georgia
NICOLAZZI, EZEQUIEL - Council On Dairy Cattle Breeding
Vanraden, Paul
MISZTAL, IGNACY - University Of Georgia

Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 10/12/2022
Publication Date: 1/18/2023
Citation: Lourenco, D., Cesarani, A., Tsuruta, S., Nicolazzi, E.L., Van Raden, P.M., Misztal, I. 2023. Big data genomic analysis in dairy cattle [abstract]. Plant and Animal Genome Conference. Available: https://plan.core-apps.com/pag_2023/abstract/e21f9b00-a05e-48b4-92b5-2ad3192a0208.

Interpretive Summary:

Technical Abstract: Genomic data are accumulating so quickly that it is unclear whether all genotyped individuals can be used in prediction models. Single-step genomic BLUP (ssGBLUP) combines all genotyped and non-genotyped individuals in a single analysis. When the number of genotyped individuals surpasses 100K, special algorithms for obtaining the inverse of the genomic relationship matrix (G^-1) make ssGBLUP feasible. One example is the algorithm for proven and young (APY), which provides a sparse representation of G^-1 because it uses recursions based on a set of core genotyped animals. We used ssGBLUP with APY to run genomic evaluations for almost 4 million genotyped dairy cattle. The analysis involved five of the most widely used breeds in the US: Holstein, Jersey, Brown Swiss, Ayrshire, and Guernsey, with 3.4M, 427.3K, 47.3K, 9.2K, and 5K genotyped animals, respectively. Single- and multi-breed evaluations were compared; the objective was to obtain genomic estimated breeding value (GEBV) reliability (bulls) and predictivity (cows) in the multi-breed evaluation that were at least as high as in the single-breed analyses. The combined evaluation comprised 29M pedigreed animals, of which 3.9M had genotypes, and 19M cows with 45M phenotypes for milk, fat, and protein across several lactations. Combining breeds with different amounts of data is challenging, especially when genomic information is used. The number of core animals in APY had to represent the dimensionality of the genomic information within each breed and was therefore set to 45K. Changes in software and models were required to enable such a large-scale evaluation within a reasonable computing time. Reliability and predictivity in the multi-breed evaluation were similar to those in the single-breed evaluations, and the combined evaluation required 72 hours to finish. Our results show that large-scale multi-breed ssGBLUP evaluations are computationally feasible.
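
To illustrate why a core set yields a sparse G^-1, the following is a minimal sketch of the APY inverse as it is commonly written in the literature: only the core block of G is inverted densely, and each non-core animal contributes a single diagonal element plus its links to the core. This is not the software used in the study. The function name apy_inverse, the use of NumPy, and the toy marker data are assumptions made for illustration, and the sketch returns a dense matrix for readability, whereas a production implementation would store only the non-zero blocks.

```python
import numpy as np


def apy_inverse(G, core):
    """Return a dense copy of the APY approximation to G^-1 (illustrative sketch).

    G    : (n, n) symmetric positive-definite genomic relationship matrix
    core : indices of the animals chosen as the core set
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    # The only dense inverse required is that of the core-by-core block.
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn  # recursion coefficients of non-core animals on the core

    # Diagonal M: conditional variance of each non-core animal given the core.
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    G_inv = np.zeros((n, n))
    G_inv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    G_inv[np.ix_(core, noncore)] = -P * m_inv
    G_inv[np.ix_(noncore, core)] = (-P * m_inv).T
    G_inv[noncore, noncore] = m_inv  # non-core block is diagonal, hence sparse
    return G_inv


# Toy data (hypothetical): 6 animals, 20 markers, 4 core animals.
rng = np.random.default_rng(0)
Z = rng.standard_normal((6, 20))
G = Z @ Z.T / 20 + 0.01 * np.eye(6)  # small ridge keeps G positive definite
G_inv_apy = apy_inverse(G, core=[0, 1, 2, 3])
```

In this sketch the non-core block of the result is diagonal, so storage grows linearly with the number of non-core animals once the core size is fixed. With a single non-core animal, or with every animal in the core, the approximation coincides with the exact inverse.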