Author
SALLAM, A.H. - University Of Minnesota
ENDELMAN, J. - University Of Wisconsin
Jannink, Jean-Luc
SMITH, K.P. - University Of Minnesota
Submitted to: The Plant Genome
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/15/2014
Publication Date: 9/5/2014
Citation: Sallam, A., Endelman, J., Jannink, J., Smith, K. 2014. Assessing genomic selection prediction accuracy in a dynamic barley breeding. The Plant Genome. (8). DOI: 10.3835/plantgenome2014.05.0020.
Interpretive Summary: Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotypes and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, but not through validation on progeny performance in a breeding program. We evaluated prediction of progeny performance in a barley breeding population with 647 lines evaluated for four traits and scored with 1,536 DNA markers. We used these data sets to investigate the effect of the prediction model on prediction accuracy over time. We found little difference in prediction accuracy among the models, confirming prior studies that found the simplest model to be accurate across a range of situations. Surprisingly, we found that training the prediction model using only population founders gave similar accuracy to training with additional descendants. Relative prediction accuracy ranged from 0.03 to 1.0 across four traits and five progeny sets. To understand this variability in prediction accuracy, we explored characteristics of the training and progeny sets. Findings from this study will help breeders design programs when they want to select individuals based on predictions.
Technical Abstract: Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation; however, validation based on progeny performance in a breeding program has not been investigated thoroughly.
We evaluated several prediction models in a dynamic barley breeding population composed of 647 six-row lines using four traits differing in genetic architecture and 1,536 SNP markers. The breeding lines were divided into six sets: a parent set and five consecutive progeny sets, each a representative sample of breeding lines over a five-year period. We used these data sets to investigate the effect of model and training population composition on prediction accuracy over time. We found little difference in prediction accuracy among the models, confirming prior studies that found the simplest model, RR-BLUP, to be accurate across a range of situations. In general, we found that using the parent set was sufficient to predict progeny sets, with little to no gain in accuracy from generating larger training populations by combining the parent set with subsequent progeny sets. Relative prediction accuracy (correlation between predicted and observed values divided by the square root of the heritability) ranged from 0.03 to 1.0 across the four traits and five progeny sets. To understand the variability in prediction accuracy that we observed, we explored characteristics of the training and validation populations (marker allele frequency, population structure, linkage disequilibrium) as well as characteristics of the trait (genetic architecture and heritability). Fixation of markers associated with a trait over time was most clearly associated with reduction in prediction accuracy. Higher trait heritability in the training population and simpler trait architecture were associated with greater prediction accuracy.
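The relative prediction accuracy defined in the abstract (correlation between predicted and observed values divided by the square root of the heritability) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the -1/1 marker coding, the simulated data, and the fixed shrinkage parameter `lam` are all assumptions made for illustration. In practice the shrinkage in RR-BLUP is usually derived from estimated variance components rather than chosen by hand.

```python
import numpy as np


def rr_blup_fit(Z, y, lam):
    """Estimate marker effects by ridge regression.

    Solves (Zc'Zc + lam * I) u = Zc'(y - mean(y)), where Zc is the
    column-centered marker matrix. `lam` is a hypothetical, user-supplied
    shrinkage parameter.
    """
    mu = y.mean()
    z_bar = Z.mean(axis=0)
    Zc = Z - z_bar
    p = Z.shape[1]
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ (y - mu))
    return mu, z_bar, u


def rr_blup_predict(Z_new, mu, z_bar, u):
    """Predict values for new lines from their marker scores."""
    return mu + (Z_new - z_bar) @ u


def relative_accuracy(predicted, observed, h2):
    """Correlation of predicted and observed values divided by sqrt(h2)."""
    r = np.corrcoef(predicted, observed)[0, 1]
    return r / np.sqrt(h2)


if __name__ == "__main__":
    # Simulated data only; sizes and effects are arbitrary, not from the study.
    rng = np.random.default_rng(0)
    n_train, n_valid, n_markers = 100, 40, 200
    Z = rng.choice([-1.0, 1.0], size=(n_train + n_valid, n_markers))
    effects = rng.normal(0.0, 0.1, size=n_markers)
    y = 5.0 + Z @ effects + rng.normal(0.0, 1.0, size=n_train + n_valid)

    # Train on the "parent" lines, validate on the "progeny" lines.
    mu, z_bar, u = rr_blup_fit(Z[:n_train], y[:n_train], lam=50.0)
    pred = rr_blup_predict(Z[n_train:], mu, z_bar, u)

    # h2 = 0.5 is an assumed heritability, used only to show the scaling.
    print(relative_accuracy(pred, y[n_train:], h2=0.5))
```

Dividing by the square root of the heritability rescales the correlation against the maximum achievable when observed phenotypes are noisy, which is why values near 1.0 can occur even when the raw correlation is well below 1.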