Research Project: Genetic Improvement of Biotic and Abiotic Stress Tolerance and Nutritional Quality in Hard Winter Wheat

Location: Hard Winter Wheat Genetics Research

Title: Development of the Wheat Practical Haplotype Graph Database as a Resource for Genotyping Data Storage and Genotype Imputation

Author
Jordan, Katherine
Bradbury, Peter
Miller, Zack - Cornell University
Nyine, Moses - Kansas State University
He, Fei - Kansas State University
Guttieri, Mary
Brown-Guedira, Gina
Buckler, Edward (Ed)
Jannink, Jean-Luc
Akhunov, Eduard - Kansas State University
Chu, Chenggen
Ward, Brian
Bai, Guihua
Bowden, Robert
Fiedler, Jason
Faris, Justin

Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 10/20/2021
Publication Date: 11/9/2021
Citation: Jordan, K., Bradbury, P., Miller, Z., Nyine, M., He, F., Guttieri, M.J., Brown-Guedira, G.L., Buckler IV, E.S., Jannink, J., Akhunov, E., Ward, B.P., Bai, G., Bowden, R.L., Fiedler, J.D., Faris, J.D. 2021. Development of the Wheat Practical Haplotype Graph Database as a Resource for Genotyping Data Storage and Genotype Imputation. G3 Genes/Genomes/Genetics. https://doi.org/10.1101/2021.06.10.447944.
DOI: https://doi.org/10.1101/2021.06.10.447944

Interpretive Summary: Developing and using large numbers of DNA markers is difficult and expensive in wheat. The Practical Haplotype Graph (PHG) is a new bioinformatic tool that leverages existing high-coverage DNA sequencing data to accurately impute marker data for additional lines from inexpensive low-coverage input data. We provide evidence that a custom-built database representing the diversity in US wheat breeding programs predicts over 1.4 million DNA sequence variants with 93% accuracy from input data with as little as one-hundredth (0.01x) sequence coverage. The PHG was significantly more accurate than Beagle, a currently popular marker imputation tool. The PHG has the potential to become an accurate, expandable, flexible, and inexpensive imputation tool for marker genotyping in wheat.

Technical Abstract: To improve the efficiency of high-density genotype data storage and imputation in bread wheat (Triticum aestivum L.), we applied the Practical Haplotype Graph (PHG) tool. The wheat PHG database was built using whole-exome capture sequencing data from a diverse set of 65 wheat accessions. Population haplotypes were inferred for reference genome intervals defined by the boundaries of high-quality gene models. Missing genotypes in the inference panels, composed of wheat cultivars or recombinant inbred lines genotyped by exome capture, genotyping-by-sequencing (GBS), or whole-genome skim-seq sequencing, were imputed using the wheat PHG database. Although imputation accuracy varied with sequencing method and coverage depth, we found 93% imputation accuracy with 0.01x sequence coverage, only slightly lower than the accuracy obtained with 0.5x sequence coverage (96.9%). In a direct comparison, PHG imputation outperformed Beagle imputation by nearly 4% (p-value = 0.00027) and was more accurate at imputing rare haplotypes. The reduced accuracy of imputation with GBS data (90.4%) is likely associated with the small overlap between GBS markers and the exome capture dataset used to construct the PHG. The highest imputation accuracy was obtained with exome capture for the wheat D genome, which also showed the highest levels of linkage disequilibrium and the highest proportion of identity-by-descent regions among accessions in our reference panel. We demonstrate that genetic mapping based on genotypes imputed using the PHG identifies SNPs with a broader range of effect sizes that together explain a higher proportion of genetic variance for heading date and meiotic crossover rate compared to previous studies.
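
The accuracy figures quoted above can be read as a concordance rate between imputed and known genotypes. The abstract does not state the exact formula used, so the following Python sketch is only one plausible reading: it assumes accuracy is the fraction of sites where the imputed call equals a held-out true call, with sites missing in either set excluded. The function name `imputation_accuracy`, the 0/1/2 genotype coding, and the toy data are all hypothetical and are not taken from the study or from the PHG software.

```python
from typing import Optional, Sequence

# Hypothetical coding (an assumption): 0, 1, 2 = alternate-allele count,
# None = missing call.
Genotype = Optional[int]


def imputation_accuracy(truth: Sequence[Genotype],
                        imputed: Sequence[Genotype]) -> float:
    """Return the fraction of comparable sites where imputed == truth.

    Sites that are missing (None) in either sequence are skipped, so the
    denominator is the number of sites called in both sets.
    """
    if len(truth) != len(imputed):
        raise ValueError("truth and imputed must cover the same sites")

    compared = 0
    matches = 0
    for true_call, imputed_call in zip(truth, imputed):
        if true_call is None or imputed_call is None:
            continue  # cannot score a site without both calls
        compared += 1
        if true_call == imputed_call:
            matches += 1

    if compared == 0:
        raise ValueError("no sites with calls in both sets")
    return matches / compared


# Toy example with made-up genotypes at eight sites.
true_calls = [0, 2, 1, 0, None, 2, 0, 0]
imputed_calls = [0, 2, 0, 0, 2, 2, None, 0]
accuracy = imputation_accuracy(true_calls, imputed_calls)
print(f"accuracy = {accuracy:.3f}")
```

In this toy case six sites are called in both sets and five agree. A real evaluation would typically mask known genotypes before imputation and might score rare and common haplotypes separately, as the comparison with Beagle suggests.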