Submitted to: International Society for Animal Genetics (ISAG)
Publication Type: Abstract Only
Publication Acceptance Date: 4/22/2016
Publication Date: 7/23/2016
Citation: Snelling, W.M., Kuehn, L.A., Lindholm-Perry, A.K. 2016. Inferring genotypes of functional variants in crossbred beef cattle [abstract]. International Society for Animal Genetics (35th ISAG). Abstract Book. p. 26 (Abstract #P1024). Available: https://www.isag.us.Docs/Proceedings/ISAG_Proceedings_2016.pdf

Technical Abstract: The current cost of sequencing individual genomes can be justified for influential ancestors of livestock populations, but it is prohibitively expensive to use whole-genome sequencing to genotype large populations. Genotypes for variants detected in sequence might be inferred with haplotype-based imputation, but reported accuracies of imputing sequence variant genotypes are lower than accuracies of imputing high density SNP array genotypes. Linear combinations of high density array SNP were almost perfectly correlated to sequence variant genotypes, suggesting that whole genome approaches might leverage long-range and low-level linkage disequilibrium to infer sequence variant genotypes more accurately than imputation based on haplotypes. Genotypes of animals with low (n=167), moderate (n=8596) and high density (n=1453) SNP array genotypes, as well as genotypes called from sequence of 270 bulls, were used to compare haplotype imputation and whole genome approaches to predicting sequence variant genotypes. All animals were from the multibreed Germplasm Evaluation population; the sequenced bulls were the most influential purebred and F1 sires in that population.
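The idea of predicting a sequence variant genotype from a linear combination of array SNP can be illustrated with a small sketch. This is not the authors' software: it uses plain ridge regression on simulated genotypes as a stand-in, and the function names (`fit_ridge`, `predict`), the penalty `lam`, and the toy data sizes are all assumptions made for illustration.

```python
import numpy as np

def fit_ridge(X_train, y_train, lam=1.0):
    """Fit ridge regression of one sequence variant on array SNP genotypes.

    X_train: animals x array SNP genotypes coded 0/1/2 (hypothetical input).
    y_train: genotype of one sequence variant for the same animals.
    lam:     ridge penalty (an assumed tuning value, not from the abstract).
    """
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    # Dual form: solve an animals x animals system, which stays small
    # even when there are many more SNP than animals.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - y_mean)
    beta = Xc.T @ alpha  # one weight per array SNP
    return x_mean, y_mean, beta

def predict(model, X_new):
    """Predict the sequence variant genotype as a linear combination of array SNP."""
    x_mean, y_mean, beta = model
    return y_mean + (X_new - x_mean) @ beta

# Toy data: 200 "sequenced" animals and 40 target animals, 50 array SNP.
rng = np.random.default_rng(1)
X_seq = rng.integers(0, 3, size=(200, 50)).astype(float)
X_target = rng.integers(0, 3, size=(40, 50)).astype(float)
# Assumption for the toy case: the sequence variant is in complete LD with array SNP 3.
y_seq = X_seq[:, 3].copy()
y_target = X_target[:, 3].copy()

model = fit_ridge(X_seq, y_seq, lam=1.0)
y_hat = predict(model, X_target)
r = np.corrcoef(y_hat, y_target)[0, 1]  # accuracy as a correlation
```

In real data the sequenced animals are few relative to the number of array SNP and linkage disequilibrium is incomplete, so the correlation would be expected to fall well below what this idealized example produces.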
Genotypes for SNP on a newly available functional SNP assay were predicted with 4 approaches: 1) population imputation, which filled high density and sequence genotypes based on agreement between high and low density haplotypes; 2) pedigree + population imputation, which considered predicted parent and progeny haplotypes to fill high density and sequence genotypes; 3) GBLUP using imputed high density genotypes to infer sequence variant genotypes; and 4) BayesC to infer sequence variant genotypes from high density genotypes. Accuracy of imputed or inferred genotypes was assessed by correlation with genotypes called from exome sequence, available on a set of crossbred sons and grandsons of the sequenced sires (n=42). Mean (SE) accuracies for imputing 1988 functional SNP on chromosome 1 were 0.74 (0.01) without and 0.78 (0.01) with pedigree information. Accuracies of inferred genotypes were 0.44 (0.01) using GBLUP and 0.74 (0.01) using BayesC. While imputation with pedigree had the highest average accuracy, one of the other approaches was more accurate for two-thirds of the variants. Maximum accuracy was most frequently obtained with BayesC (42%), followed by imputation with pedigree (34%), imputation without pedigree (21%) and GBLUP inference (3%). The mean maximum accuracy was 0.88 (0.003), indicating that accuracy of predicting sequence variant genotypes can be improved by applying multiple approaches. Further exploration is needed to characterize conditions contributing to accuracy of each approach.
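The accuracy summaries reported above (mean accuracy per approach, the share of variants for which each approach was most accurate, and the mean of the per-variant maximum) could be computed along the following lines. This is a minimal sketch on simulated data, not the authors' code; the function names, the placeholder approach labels, and the noise levels are assumptions.

```python
import numpy as np

def per_snp_accuracy(predicted, observed):
    """Pearson correlation between predicted and observed genotypes,
    computed separately for each SNP (columns) across animals (rows)."""
    p = predicted - predicted.mean(axis=0)
    o = observed - observed.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (o ** 2).sum(axis=0))
    with np.errstate(divide="ignore", invalid="ignore"):
        return (p * o).sum(axis=0) / denom  # NaN where a SNP has no variation

def summarize(predictions, observed):
    """Compare approaches; `predictions` maps an approach name to its genotype matrix."""
    names = list(predictions)
    acc = np.vstack([per_snp_accuracy(predictions[n], observed) for n in names])
    # Keep only SNP with a defined accuracy under every approach.
    acc = acc[:, np.isfinite(acc).all(axis=0)]
    best = acc.argmax(axis=0)  # index of the most accurate approach per SNP
    return {
        "mean_accuracy": {n: float(acc[i].mean()) for i, n in enumerate(names)},
        "share_best": {n: float((best == i).mean()) for i, n in enumerate(names)},
        "mean_max_accuracy": float(acc.max(axis=0).mean()),
    }

# Simulated example: 42 validation animals, 200 SNP, two hypothetical approaches.
rng = np.random.default_rng(0)
observed = rng.integers(0, 3, size=(42, 200)).astype(float)
predictions = {
    "approach_a": observed + rng.normal(0.0, 0.5, observed.shape),
    "approach_b": observed + rng.normal(0.0, 1.0, observed.shape),
}
summary = summarize(predictions, observed)
```

Because the maximum is taken per variant, `mean_max_accuracy` can never fall below the best single-approach mean, which matches the pattern in the abstract (0.88 versus 0.78).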