Research Project: Improving Dairy Animals by Increasing Accuracy of Genomic Prediction, Evaluating New Traits, and Redefining Selection Goals

Location: Animal Genomics and Improvement Laboratory

Title: Scalable mixed model approach for finding the omnigenic core genes

Author
Jiang, Jicai - University of Maryland
Vanraden, Paul
Ma, Li - University of Maryland
O'Conell, Jeffrey - University of Maryland School of Medicine

Submitted to: Journal of Dairy Science
Publication Type: Abstract Only
Publication Acceptance Date: 4/8/2021
Publication Date: 6/28/2021
Citation: Jiang, J., Van Raden, P.M., Ma, L., O'Conell, J.R. 2021. Scalable mixed model approach for finding the omnigenic core genes [abstract]. Journal of Dairy Science. 104(Suppl. 1):79(abstr. 204).

Technical Abstract: To make use of the rapidly growing volume of genomic data, we present a scalable mixed model approach for genome-wide association studies (GWAS), which we refer to as SSGP, that can handle millions of genotyped animals. Using simulations, we show that our method is as accurate as EMMAX and a few times faster than BOLT. SSGP can address the genomic inflation issue in large-scale GWAS in domestic animals. Substantial genomic inflation arises in GWAS in the presence of polygenic inheritance, even when population structure and relatedness have been accounted for. This can be demonstrated with the non-centrality parameter (NCP) for a SNP that is in linkage disequilibrium (LD) with causal variants: NCP_i ≈ N Σ_j r_ij^2 q_j^2, where NCP_i is the NCP for SNP i, r_ij is the correlation coefficient between SNP i and causal variant j, q_j^2 is the proportion of phenotypic variance explained by causal variant j, and N is the sample size. If polygenic effects are not well accounted for, the NCP for many tested SNPs may be large in large-scale GWAS, especially in domestic animals, which generally have a small effective population size and strong, long-span LD across the genome. As a result, significant loci may appear everywhere on the genome, even though any single causal variant has an undetectable effect. We illustrate this phenomenon with leave-one-chromosome-out (LOCO) GWAS on a large cattle data set and on simulations, because LOCO GWAS does not account for the effects of causal variants on the same chromosome as the tested SNP. We also show that only a few significant loci are found when whole-genome polygenic effects are accounted for by SSGP. This finding is in line with the recently proposed omnigenic model of core versus peripheral genes: the few loci that are significant under SSGP correspond to core genes, whereas the LOCO-significant loci found everywhere on the genome result from peripheral genes that each have a tiny effect.
In summary, our method is useful for finding omnigenic core genes that matter in functional studies and targeted genome editing. SSGP is freely available at https://github.com/jiang18/ssgp.
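
The NCP approximation in the abstract can be explored numerically. The following is a minimal sketch, not part of SSGP and not taken from the authors' code. It evaluates NCP_i ≈ N Σ_j r_ij^2 q_j^2 for one tested SNP under assumptions invented for illustration: an exponential LD decay model (`toy_r2`), 1,000 evenly spaced causal variants on one chromosome that together explain 3% of phenotypic variance, and two hypothetical LD spans. The function names, positions, and parameter values are all hypothetical.

```python
import math


def ncp_approx(n_samples, r2_row, q2):
    """Approximate NCP_i = N * sum_j r_ij^2 * q_j^2 for one tested SNP i."""
    if len(r2_row) != len(q2):
        raise ValueError("r2_row and q2 must have the same length")
    return n_samples * sum(r2 * q for r2, q in zip(r2_row, q2))


def toy_r2(pos_i, pos_j, ld_span_kb):
    """Hypothetical LD model (an assumption): r^2 decays exponentially with distance in kb."""
    return math.exp(-abs(pos_i - pos_j) / ld_span_kb)


# Hypothetical chromosome: 1,000 evenly spaced causal variants that together
# explain 3% of phenotypic variance, so each q_j^2 is only 3e-5.
n_causal = 1000
q2 = [0.03 / n_causal] * n_causal
causal_pos_kb = [100.0 * k for k in range(n_causal)]
tested_pos_kb = 50_050.0  # a non-causal SNP midway between two causal variants

for n_samples in (10_000, 1_000_000):
    for ld_span_kb in (50.0, 1000.0):  # short-span vs long-span LD
        r2_row = [toy_r2(tested_pos_kb, p, ld_span_kb) for p in causal_pos_kb]
        ncp = ncp_approx(n_samples, r2_row, q2)
        # Contribution of a single causal variant at r^2 = 1, for comparison.
        single = n_samples * q2[0]
        print(f"N={n_samples:>9,}  LD span={ld_span_kb:>6.0f} kb  "
              f"single-variant NCP={single:8.2f}  tested-SNP NCP={ncp:8.2f}")
```

Under this toy model, the NCP of the tested SNP scales linearly with N and grows as the LD span lengthens, because more causal variants contribute a non-negligible r_ij^2. Since the expected value of a 1-df non-central chi-square statistic is 1 + NCP, a large NCP at a non-causal SNP corresponds to the kind of inflation the abstract describes for LOCO GWAS in populations with long-span LD.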