
Title: Genotype by environment interaction and the use of unbalanced historical data for genomic selection in an international wheat breeding program

Author
DAWSON, JULIE - Cornell University
ENDELMAN, JEFFREY - Cornell University
HESLOT, NICOLAS - Cornell University
CROSSA, JOSA - International Maize & Wheat Improvement Center (CIMMYT)
Poland, Jesse
DREISIGACKER, SUSANNE - International Maize & Wheat Improvement Center (CIMMYT)
MANES, YANN - Syngenta
SORRELLS, MARK - Cornell University
Jannink, Jean-Luc

Submitted to: Field Crops Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/24/2013
Publication Date: 12/14/2013
Citation: Dawson, J.C., Endelman, J., Heslot, N., Crossa, J., Poland, J.A., Dreisigacker, S., Manes, Y., Sorrells, M., Jannink, J. 2013. Genotype by environment interaction and the use of unbalanced historical data for genomic selection in an international wheat breeding program. Field Crops Research. 154:12-22.

Interpretive Summary: Genomic selection (GS) predicts the future performance of new breeding lines from the performance of related lines combined with high-density DNA marker data. Data from past experimental trials that a breeding program ran for conventional selection can be used for this purpose. These trials may be "unbalanced" in time and space, in the sense that not all lines were evaluated together in all trials. Interactions between genotype and environment in these trials may reduce the accuracy of the predictions they enable. Using the International Maize and Wheat Improvement Center’s (CIMMYT) Semi-Arid Wheat Yield Trials (SAWYT), we assessed the accuracy of genomic predictions and the potential to subset these trials to reduce the impact of genotype-by-environment interaction on genomic prediction accuracy. We found no difference in accuracy between models that accounted for genotype-by-environment interactions and global models that did not require subsetting environments. Data-driven methods of clustering trials into subsets based on similarities in genomic predictions also failed to improve accuracies relative to global models. Using a simulation based on the real SAWYT data, we found that if true genotypic values (TGV) differed between subsets, there was an advantage to modeling genotype-by-environment interaction (GxE) in prediction models. In the SAWYT dataset there does not appear to be a consistent pattern of genotype-by-environment interaction, so this dataset cannot be partitioned into subsets that have improved predictive power.

Technical Abstract: Genomic selection (GS) offers breeders the possibility of using historic data and unbalanced breeding trials to form training populations for predicting the performance of new lines. However, datasets that are unbalanced over time and space increase exposure to particular genotype-by-environment combinations and interactions that may make predictions less accurate. Global cross-validated genomic prediction accuracies may be high when using large historic datasets, but accuracies for individual years using a forward-prediction approach, or accuracies for individual environments, are often much lower. Using the International Maize and Wheat Improvement Center’s (CIMMYT) Semi-Arid Wheat Yield Trials (SAWYT), we assessed the accuracy of genomic predictions and the potential to subset these nurseries using the CIMMYT concept of mega-environments (ME) adapted to a genomic selection context. We found no difference in accuracy between models accounting for genotype-by-environment interactions and global models using the CIMMYT ME. Data-driven methods of clustering trials into ME based on similarities in genomic predictions also failed to improve accuracies within ME. Using a simulation based on the real SAWYT data, we found that if true genotypic values (TGV) differed between ME, there was an advantage to modeling genotype-by-environment interaction (GxE) in prediction models. In the SAWYT dataset there does not appear to be a consistent pattern of genotype-by-environment interaction among the CIMMYT ME, and this dataset is not balanced enough to partition into new ME that have predictive power.
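
The distinction the abstract draws between global cross-validated accuracy and forward-prediction accuracy for individual years can be illustrated with a minimal sketch. This is not the authors' model or data: it assumes ridge regression on marker scores as a stand-in prediction model, a purely synthetic dataset, and hypothetical names throughout (`simulate`, `ridge_fit`, `cv_accuracy`, `forward_accuracy`, and the penalty `lam`). Accuracy is taken here as the correlation between predicted and observed phenotypes.

```python
import numpy as np

def simulate(n=300, p=200, n_years=5, seed=1):
    """Synthetic lines x markers data (an assumption, not SAWYT data)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(-1, 2, size=(n, p)).astype(float)  # markers coded -1/0/1
    effects = rng.normal(0.0, 1.0, size=p)
    g = X @ effects                                      # simulated genetic value
    y = g + rng.normal(0.0, g.std(), size=n)             # add noise of equal spread
    years = np.repeat(np.arange(n_years), n // n_years)  # each line seen in one year
    return X, y, years

def ridge_fit(X, y, lam):
    """Estimate shrunken marker effects by ridge regression."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean

def ridge_predict(model, X):
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean

def cv_accuracy(X, y, lam, k=5, seed=0):
    """Global k-fold cross-validation: folds ignore year structure."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    preds = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        model = ridge_fit(X[train_idx], y[train_idx], lam)
        preds[test_idx] = ridge_predict(model, X[test_idx])
    return np.corrcoef(preds, y)[0, 1]

def forward_accuracy(X, y, years, lam):
    """Forward prediction: train on all earlier years, predict each later year."""
    acc = {}
    for yr in np.unique(years)[1:]:
        train, test = years < yr, years == yr
        model = ridge_fit(X[train], y[train], lam)
        acc[int(yr)] = np.corrcoef(ridge_predict(model, X[test]), y[test])[0, 1]
    return acc

X, y, years = simulate()
lam = 100.0  # arbitrary penalty chosen for illustration
cv_acc = cv_accuracy(X, y, lam)
fwd_acc = forward_accuracy(X, y, years, lam)
```

Because the synthetic years here share the same marker effects and contain no genotype-by-environment interaction, any gap between `cv_acc` and the values in `fwd_acc` mostly reflects training-set size. In historic trial data, differences among years and environments would add to that gap.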