Research Project: Improving Dairy Animals by Increasing Accuracy of Genomic Prediction, Evaluating New Traits, and Redefining Selection Goals

Location: Animal Genomics and Improvement Laboratory

Title: Invited review: Unknown-parent groups and metafounders in single-step genomic BLUP

Author
MASUDA, YUTAKA - University Of Georgia
Vanraden, Paul
TSURUTA, SHOGO - University Of Georgia
LOURENCO, DANIELA - University Of Georgia
MISZTAL, IGNACY - University Of Georgia

Submitted to: Journal of Dairy Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/26/2021
Publication Date: 2/1/2022
Citation: Masuda, Y., Van Raden, P.M., Tsuruta, S., Lourenco, D.A.L., Misztal, I. 2022. Invited review: Unknown-parent groups and metafounders in single-step genomic BLUP. Journal of Dairy Science. 105(2):923–939. https://doi.org/10.3168/jds.2021-20293.
DOI: https://doi.org/10.3168/jds.2021-20293

Interpretive Summary: Pedigree records are often incomplete in dairy cattle. Even in the genomic era, missing pedigree is a source of bias and inaccuracy in genomic predictions that combine all data sources. Traditionally, pseudo-individuals called unknown-parent groups are assigned in place of missing parents to remove the bias. However, it was unclear how these groups behave in the genomic model. We discussed possible statistical models for the groups, their impact on genomic predictions, and their relationship to an alternative model called metafounders. This review helps readers choose the most appropriate model for less-biased predictions when some parents are unknown.

Technical Abstract: Single-step genomic BLUP (ssGBLUP) is a method for genomic prediction that combines the pedigree relationship matrix (A) and the genomic relationship matrix (G) into a unified matrix whose inverse enters a linear mixed model. In dairy cattle, pedigree information is often incomplete. Missing pedigree can cause bias and inflation in genomic estimated breeding values (GEBVs) from ssGBLUP. Three issues are associated with missing pedigree: selection that is not accounted for, missing inbreeding in pedigree relationships, and incompatibility between G and A in level and scale. These issues can be addressed with a proper model of unknown-parent groups (UPGs). UPG theory is well established in pedigree BLUP but is unclear in ssGBLUP. This study reviewed the development of the UPG model in pedigree BLUP, the properties of UPG models in ssGBLUP, and their impact on genetic trends and genomic predictions. The similarities and differences between UPGs and metafounders (MFs), a generalized UPG model, were also reviewed. One UPG model (QP) is derived from a transformation of the mixed model equations; it converges well when the equations are solved, but without enough data it may cause biased genetic trends and underestimated UPG effects because of confounding among GEBVs, UPG effects, and the general mean for genotyped animals. The QP model can be altered by removing the genomic relationships that link GEBVs and UPG effects; the altered model results in less bias in genetic trends and less inflation in genomic predictions than the QP model, especially for large data sets. A newer model encapsulates the UPG equations into the pedigree relationships for genotyped animals; it works well in simulations of purebred populations. The MF model is a comprehensive solution to the missing-pedigree issues; it is an option for multi-breed or crossbred evaluations if the data set allows a reasonable relationship matrix among MFs to be estimated.
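The "inverse of a unified matrix" mentioned above can be illustrated with a small numerical sketch. It assumes the commonly used ssGBLUP form in which the inverse of the unified matrix H equals the inverse of A plus a correction, G⁻¹ minus A22⁻¹, added to the block for genotyped animals (A22 is the pedigree relationship matrix among genotyped animals). The six-animal pedigree, the random genotypes, the 5% blending of A22 into G, and all function names are hypothetical choices made for illustration. Unknown parents are treated here as unrelated, non-inbred base animals, with no UPGs or MFs, which is the simplification the review identifies as a source of bias.

```python
import numpy as np

def pedigree_relationship(sires, dams):
    """Tabular method for the numerator relationship matrix A.

    Animals are numbered 1..n with parents listed before offspring;
    0 marks an unknown parent (assumed unrelated and non-inbred).
    """
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i] - 1, dams[i] - 1  # -1 now means "unknown"
        for j in range(i):
            a_js = A[j, s] if s >= 0 else 0.0
            a_jd = A[j, d] if d >= 0 else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
        inbreeding = 0.5 * A[s, d] if (s >= 0 and d >= 0) else 0.0
        A[i, i] = 1.0 + inbreeding
    return A

def genomic_relationship(M, A22, blend=0.05):
    """G = ZZ' / (2 * sum p(1-p)), blended with A22 so it is invertible.

    Allele frequencies are taken from the observed genotypes, which is
    an assumption of this sketch rather than a recommendation.
    """
    p = M.mean(axis=0) / 2.0
    Z = M - 2.0 * p
    k = 2.0 * np.sum(p * (1.0 - p))
    G = Z @ Z.T / k
    return (1.0 - blend) * G + blend * A22

def h_inverse(A, G, genotyped):
    """Inverse of the unified matrix: A^-1 plus (G^-1 - A22^-1) on the genotyped block."""
    block = np.ix_(genotyped, genotyped)
    H_inv = np.linalg.inv(A)
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A[block])
    return H_inv

# Hypothetical pedigree: animal 5 comes from full sibs 3 and 4; animal 6 has an unknown sire.
sires = [0, 0, 1, 1, 3, 0]
dams = [0, 0, 2, 2, 4, 4]
genotyped = [2, 3, 4, 5]  # zero-based positions of animals 3 to 6

A = pedigree_relationship(sires, dams)
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(len(genotyped), 50)).astype(float)  # made-up genotypes coded 0/1/2
G = genomic_relationship(M, A[np.ix_(genotyped, genotyped)])
H_inv = h_inverse(A, G, genotyped)
```

Only the genotyped block of the inverse differs from the pedigree-only inverse, so inverting `H_inv` gives a matrix whose genotyped block equals `G`. The UPG and MF models discussed in the review change how missing parents enter A and its inverse; this sketch does not implement any of them.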
Missing pedigree influences genetic trends, but its effect on the predictability of genotyped animals should be negligible when many proven bulls are genotyped. In such a situation, SNP effects can be back-solved from the GEBVs of the older genotyped animals, and indirect prediction based on those SNP effects is useful for calculating GEBVs for young genotyped animals with missing parents.
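The back-solving and indirect prediction steps described above can be sketched as follows. The sketch assumes a genomic relationship matrix of the form G = ZZ'/k with k = 2Σp(1−p), no blending or weighting, and the usual back-solving expression û = Z'G⁻¹â / k, where â holds the GEBVs of the older genotyped animals. The genotypes, the GEBV values, the base allele frequencies of 0.5, and the function names are all made up for illustration. Any UPG, MF, or mean term that a real evaluation would add to the indirect prediction is left out.

```python
import numpy as np

def center_genotypes(M, p):
    """Center 0/1/2 genotype codes by twice the assumed base allele frequency."""
    return M - 2.0 * p

def backsolve_snp_effects(Z, gebv, k):
    """Back-solve SNP effects from GEBVs: u_hat = Z' G^-1 a_hat / k, with G = ZZ'/k."""
    G = Z @ Z.T / k
    return Z.T @ np.linalg.solve(G, gebv) / k

def indirect_prediction(Z_young, u_hat):
    """Genomic prediction for young animals as the sum of their SNP effects."""
    return Z_young @ u_hat

rng = np.random.default_rng(1)
n_snp = 200
p = np.full(n_snp, 0.5)              # assumed base allele frequencies
k = 2.0 * np.sum(p * (1.0 - p))

# Five older genotyped animals with made-up GEBVs.
M_old = rng.integers(0, 3, size=(5, n_snp)).astype(float)
Z_old = center_genotypes(M_old, p)
gebv_old = np.array([1.2, -0.4, 0.7, 0.1, -0.9])

u_hat = backsolve_snp_effects(Z_old, gebv_old, k)

# Two young genotyped animals whose parents are unknown; only genotypes are needed.
M_young = rng.integers(0, 3, size=(2, n_snp)).astype(float)
young_pred = indirect_prediction(center_genotypes(M_young, p), u_hat)
```

With an unblended G, applying the back-solved effects to the older animals' own genotypes reproduces their GEBVs, which is a convenient consistency check. The young animals' predictions use no pedigree information, which is why this route is attractive when their parents are missing.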