Location: Animal Genomics and Improvement Laboratory
Title: Big data genomic analysis in dairy cattle
Author
LOURENCO, DANIELA - University Of Georgia
CESARANI, ALBERTO - University Of Sassari
TSURUTA, SHOGO - University Of Georgia
NICOLAZZI, EZEQUIEL - Council On Dairy Cattle Breeding
Vanraden, Paul
MISZTAL, IGNACY - University Of Georgia
Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 10/12/2022
Publication Date: 1/18/2023
Citation: Lourenco, D., Cesarani, A., Tsuruta, S., Nicolazzi, E.L., Van Raden, P.M., Misztal, I. 2023. Big data genomic analysis in dairy cattle [abstract]. Plant and Animal Genome Conference. Available: https://plan.core-apps.com/pag_2023/abstract/e21f9b00-a05e-48b4-92b5-2ad3192a0208
Interpretive Summary:
Technical Abstract: Genomic data are accumulating so quickly that it is unclear whether all genotyped individuals can be used in prediction models. Single-step genomic BLUP (ssGBLUP) combines all genotyped and non-genotyped individuals in a single analysis. When the number of genotyped individuals surpasses 100K, special algorithms to obtain the inverse of the genomic relationship matrix (G^-1) make ssGBLUP feasible. One example is the algorithm for proven and young (APY), which provides a sparse representation of G^-1 because it uses recursions based on a set of core genotyped animals. We used ssGBLUP with APY to run genomic evaluations for almost 4 million genotyped dairy cattle. The analysis involved five of the most used breeds in the US: Holstein, Jersey, Brown Swiss, Ayrshire, and Guernsey, with 3.4M, 427.3K, 47.3K, 9.2K, and 5K genotyped animals, respectively. Single- and multi-breed evaluations were compared; the objective was for GEBV reliability (bulls) and predictivity (cows) in the multi-breed evaluation to be at least as high as in the single-breed analyses. The combined evaluation comprised 29M pedigreed animals, of which 3.9M had genotypes, and 19M cows had 45M phenotypes for milk, fat, and protein across several lactations. Combining breeds with different amounts of data is challenging, especially if genomic information is used. The number of core animals in APY had to represent the dimensionality of genomic information within each breed and was therefore set to 45K. Changes in software and models were required to enable such a large-scale evaluation with a reasonable computing time. Reliability and predictivity in the multi-breed evaluation were similar to those in the single-breed evaluations, and the combined evaluation required 72 hours to finish. Our results show that large-scale multi-breed ssGBLUP evaluations are computationally feasible.
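The abstract does not spell out the APY recursions, so the following is only a minimal sketch of the commonly published form of the APY inverse, in which each non-core animal is regressed on the core animals and the non-core residual block is treated as diagonal. The function name `apy_inverse`, the dense NumPy layout, and the simulated toy genotypes are assumptions made for illustration; this is not the software used in the study, which would need sparse, out-of-core storage at the scale described.

```python
import numpy as np


def apy_inverse(G, core):
    """Dense toy version of the APY approximation to G^-1.

    G    : (n, n) symmetric positive-definite genomic relationship matrix
    core : indices of the core animals; all other animals are non-core
    """
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)

    # Only the core block is inverted directly.
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]

    # Regression of each non-core animal on the core animals.
    P = Gcc_inv @ Gcn  # shape (n_core, n_noncore)

    # Residual variance per non-core animal: m_i = g_ii - g_ic Gcc^-1 g_ci.
    m = np.diag(G)[noncore] - np.einsum("ij,ij->j", Gcn, P)

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P / m) @ P.T
    Ginv[np.ix_(core, noncore)] = -P / m
    Ginv[np.ix_(noncore, core)] = (-P / m).T
    Ginv[noncore, noncore] = 1.0 / m  # non-core block is diagonal (sparse)
    return Ginv


# Toy data (hypothetical): 6 animals, 40 simulated marker covariates.
rng = np.random.default_rng(0)
Z = rng.standard_normal((6, 40))
G = Z @ Z.T / 40 + 0.01 * np.eye(6)  # small ridge keeps G invertible

Ginv_apy = apy_inverse(G, core=[0, 1, 2])
```

In this form only the core-by-core block is inverted and the non-core block of the result is diagonal, which is where the sparsity mentioned in the abstract comes from. With a single non-core animal the expression reduces to the ordinary block-inverse formula, so it matches the direct inverse; with more non-core animals it is an approximation whose quality depends on how well the core set captures the genomic information.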