Research Project: Characterization of Quality and Marketability of Western U.S. Wheat Genotypes and Phenotypes

Location: Wheat Health, Genetics, and Quality Research

Title: Genomic selection for end-use quality and processing traits in soft white winter wheat breeding program with machine and deep learning models

Author
SANDHU, KARANSHER - Washington State University
AOUN, MERIEM - Washington State University
Morris, Craig
CARTER, ARRON - Washington State University

Submitted to: Biology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/14/2021
Publication Date: 7/20/2021
Citation: Sandhu, K.S., Aoun, M., Morris, C.F., Carter, A.H. 2021. Genomic selection for end-use quality and processing traits in soft white winter wheat breeding program with machine and deep learning models. Biology. 10(7). Article 689. https://doi.org/10.3390/biology10070689.
DOI: https://doi.org/10.3390/biology10070689

Interpretive Summary: This research paper assessed the potential of machine and deep learning genomic selection models for predicting fourteen end-use quality traits at two locations in a soft white winter wheat breeding program. Cross-validation, forward, and across-location prediction scenarios were used to compare the models and to evaluate how the approach could be applied in the breeding program. Owing to limited seed availability, time constraints, and the associated cost, phenotyping for quality traits is delayed to later generations. However, the high accuracy of the prediction models observed in this study suggests that selections can be performed earlier in the breeding process. Machine and deep learning models performed better than Bayesian and RRBLUP genomic selection models and can be adopted for use in plant breeding programs, regardless of dataset size. Furthermore, forward prediction accuracy increased as more lines were added to the training set, which indicates that genomic selection models should be updated every year for the best prediction accuracy. Overall, this and previous studies show the benefit of implementing genomic selection with machine and deep learning models for different complex traits in large-scale breeding programs, using phenotypic data collected in previous years.

Technical Abstract: Breeding for grain yield, biotic and abiotic stress resistance, and end-use quality are important goals of wheat breeding programs. Screening for end-use quality traits is usually secondary to grain yield because of high labor needs, cost of testing, and large seed requirements for phenotyping. Hence, testing is delayed until later stages of the breeding program, and this delay allows lines with inferior end-use quality to advance through the program. Genomic selection provides an alternative by predicting performance from genome-wide markers. Because breeding programs generate large datasets, we explored the potential of machine and deep learning models to predict fourteen end-use quality traits in a winter wheat breeding program. The population consisted of 666 wheat genotypes screened over five years (2015-19) at two locations (Pullman and Lind, WA, USA). Nine different models, including two machine learning models (random forest and support vector machine) and two deep learning models (convolutional neural network and multilayer perceptron), were explored for cross-validation, forward, and across-location predictions. Prediction accuracies for the different traits ranged from 0.45-0.81, 0.29-0.55, and 0.27-0.50 under cross-validation, forward, and across-location predictions, respectively. In general, forward prediction accuracies increased over time as the training set grew, and the increase was more evident for machine and deep learning models. Deep learning models outperformed the traditional ridge regression best linear unbiased prediction (RRBLUP) and Bayesian models under all prediction scenarios. The high accuracy observed for end-use quality traits in this study supports predicting them in early generations, leading to the advancement of superior genotypes to more extensive grain yield trials. Furthermore, the superior performance of machine and deep learning models strengthens the case for including them in large-scale breeding programs for predicting complex traits.
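
The cross-validation and forward prediction scenarios named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes that prediction accuracy is measured as the Pearson correlation between observed and predicted phenotypes (a common convention that this page does not state), and it uses plain ridge regression on marker scores as a simple stand-in for RRBLUP. The data are simulated, and all function names, the penalty `lam`, and the marker coding (0/1/2) are hypothetical choices made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression on centered markers.
    Returns (intercept, marker_effects). Stand-in for RRBLUP (assumption)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    n_markers = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers),
                           Xc.T @ (y - y_mean))
    return y_mean - x_mean @ beta, beta

def ridge_predict(X, intercept, beta):
    return intercept + X @ beta

def accuracy(y_obs, y_pred):
    """Pearson correlation between observed and predicted values (assumed metric)."""
    return float(np.corrcoef(y_obs, y_pred)[0, 1])

def cv_accuracy(X, y, k=5, lam=1.0, seed=0):
    """k-fold cross-validation: each fold is held out once and predicted
    from a model trained on the remaining lines."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        b0, beta = ridge_fit(X[train_idx], y[train_idx], lam)
        scores.append(accuracy(y[test_idx], ridge_predict(X[test_idx], b0, beta)))
    return float(np.mean(scores))

def forward_accuracy(X, y, years, target_year, lam=1.0):
    """Forward prediction: train on all lines from earlier years,
    then predict the lines evaluated in target_year."""
    train = years < target_year
    test = years == target_year
    b0, beta = ridge_fit(X[train], y[train], lam)
    return accuracy(y[test], ridge_predict(X[test], b0, beta))

# Simulated example data (not the study's data).
rng = np.random.default_rng(42)
n_lines, n_markers = 300, 200
X = rng.integers(0, 3, size=(n_lines, n_markers)).astype(float)  # 0/1/2 marker scores
true_effects = rng.normal(0.0, 0.3, size=n_markers)
y = X @ true_effects + rng.normal(0.0, 2.0, size=n_lines)        # trait = genetics + noise
years = rng.integers(2015, 2020, size=n_lines)                   # trial year per line

cv_acc = cv_accuracy(X, y, k=5, lam=50.0)
fwd_acc = forward_accuracy(X, y, years, target_year=2019, lam=50.0)
print(f"5-fold cross-validation accuracy: {cv_acc:.2f}")
print(f"Forward prediction accuracy for 2019: {fwd_acc:.2f}")
```

In this sketch the only difference between the two scenarios is how lines are assigned to the training and test sets: at random for cross-validation, and by year for forward prediction. Across-location prediction would follow the same pattern, with a location label replacing the year.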